Let $(X_{n,i})_{1\le i\le n,\, n\in\mathbb{N}}$ be a triangular array of row-wise
stationary $\mathbb{R}^d$-valued random variables. We use a "blocks method" to
define clusters of extreme values: the rows of $(X_{n,i})$ are divided into
$m_n$ blocks $(Y_{n,j})$, and if a block contains at least one extreme value,
the block is considered to contain a cluster. The cluster starts at the
first extreme value in the block and ends at the last one. The main
results are uniform central limit theorems for empirical processes
$Z_n(f) := \frac{1}{\sqrt{n v_n}} \sum_{j=1}^{m_n} (f(Y_{n,j}) - E f(Y_{n,j}))$,
for $v_n = P\{X_{n,i} \ne 0\}$ and $f$ belonging to classes of cluster
functionals, that is, functions of the blocks $Y_{n,j}$ which only depend on
the cluster values and which are equal to 0 if $Y_{n,j}$ does not contain a
cluster. Conditions for finite-dimensional convergence include
$\beta$-mixing, suitable Lindeberg conditions and convergence of covariances.
To obtain full uniform convergence, we use either "bracketing entropy" or
bounds on covering numbers with respect to a random semi-metric. The latter
makes it possible to bring the powerful Vapnik-Cervonenkis theory to bear.
Applications include multivariate tail empirical processes and em- 
pirical processes of cluster values and of order statistics in clusters. 
Although our main field of applications is the analysis of extreme 
values, the theory can be applied more generally to rare events oc- 
curring, for example, in nonparametric curve estimation. 

1. Introduction. The next challenge for extreme value statistics is mod- 
eling and estimation of the structure of clusters of extreme values. As one 
concrete example, the Europe 2003 heat wave may have killed around 60,000 
persons. There has been a substantial discussion of whether it could be
attributed to global warming. The Nature paper [Stott, Stone and Allen 
(2004)] uses extreme value methods with average summer temperature as a 
proxy for a heat wave to try to answer this question. However, the health 
effects are in reality linked to clusters of extremely high temperatures over 
much shorter time periods, and the fluctuations of temperature during this 
period determine risks. 

Similarly, river flooding may be caused by not just one extreme rainfall 
event, but also by the ground already being saturated with water due to 
high precipitation during the preceding 5-10 days. This was, for example, 
the case for the large flood which occurred in Northern Sweden on July 26, 
2000. Thus, again, an entire sequence of large values is at the center of
interest.

This paper develops an empirical limit theory for clusters of extremes in 
stationary sequences. It provides a unified basis for asymptotic analysis of 
statistical methods which aim at answering questions such as the ones above. 
Results include limit theorems for tail array sums, in particular, for multi- 
variate tail empirical processes, and for joint survival functions of the values 
and order statistics in a cluster. More special examples, such as upcrossings, 
compound insurance claims, kernel density and bootstrap estimators, are 
also studied. 

Estimation of the extremal index (roughly, the inverse of the expected
cluster length) has received substantial attention in the extreme value
statistics literature. The results of this paper can be used to prove asymptotic 
normality for a general type of estimator based on blocks of exceedances; 
see Drees (2010). There are also a few papers [e.g., Bortot and Tawn (1998), 
Sisson and Coles (2003)] on Markov chain modeling of clusters of extreme 
values. However, a major part of the work to develop useful statistical meth- 
ods for the structure of clusters of extremes still remains to be done. Our 
goal is that this paper will be useful for the analysis of existing methods, 
and that it will spur development of new methods. 

More specifically, we consider triangular arrays of row-wise stationary
sequences of random variables. The variables are assumed to take their values
in some set $E \subset \mathbb{R}^d$, with $E = \mathbb{R}$ and
$E = \mathbb{R}^d$ as the standard examples.
Clusters of extremes are defined through a "blocks" method. The variables
in each row of the array are divided up into blocks, and a cluster of extremes
starts with the first "extreme" value in a block, if there is such a value, and
ends with the last one. Such a cluster is termed the "core" of the block. A
function which maps a block into a real number is called a "cluster
functional" if it only depends on the core of the block and if it equals 0 for
blocks without extremes. In contrast to the setting of standard uniform
central limit theorems, cores (i.e., clusters of extremes) consist of a random
number of variables, and, hence, cluster functionals have to be defined on a
space of vectors of arbitrary lengths.

The aim is to prove uniform central limit theorems for interesting classes 
of cluster functionals. Throughout, we use $\beta$-mixing (or, with another name,
absolute regularity) as the basic dependence restriction. It is very widely ap- 
plicable and makes it possible to transfer calculations from dependent blocks 
to easier calculations with independent blocks. Finite-dimensional conver- 
gence of the cluster functionals in addition requires Lindeberg conditions 
and convergence of covariances. We use suitable formulations of "bracketing 
entropy" to give conditions for asymptotic tightness, and bounds on cov- 
ering numbers with respect to a random semi-metric to prove asymptotic 
equicontinuity. The latter, in particular, makes it possible to use the Vapnik- 
Cervonenkis theory to prove asymptotic equicontinuity. As usual, uniform 
central limit theorems follow from finite-dimensional convergence together 
with asymptotic tightness, or together with asymptotic equicontinuity. 

In the important context of estimation for panel count data, two arti- 
cles by Wellner and Zhang (2000, 2007) use uniform central limit theory for 
vectors of random lengths. These articles are aimed at the specific applica- 
tion and not at general theory. Hence, they use special properties (such as 
monotonicity) of the classes of functions, do not consider triangular arrays, 
assume that the vectors are independent, and, in the second paper, also 
assume that the lengths of the vectors are uniformly bounded. However, 
the basic tools to prove tightness, that is, random covering numbers for the 
general case, and bracketing entropy for the uniformly bounded case are the 
same as in the present paper. We have not found any other references on 
uniform central limit theory for random vectors with random lengths. 

One application of the theory of this paper is to multivariate tail
empirical processes for stationary time series. Let $(X_i)_{i\in\mathbb{N}}$ be
a time series with marginal survival function $\bar H = 1 - H$. The univariate
tail empirical process is defined as

$$e_n(x) := \frac{1}{\sqrt{n v_n}} \sum_{i=1}^{n} \bigl(1_{\{X_{n,i} > x\}} - P\{X_{n,i} > x\}\bigr), \qquad x \in [0,\infty),$$

where

(1.1) $X_{n,i} := \Bigl(\frac{X_i - u_n}{a_n}\Bigr)_+ .$

The multivariate tail empirical process is defined analogously; see Examples
3.1 and 3.8 below. In the definition $(u_n)_{n\in\mathbb{N}}$ is an increasing
sequence of thresholds such that $v_n := P\{X_1 > u_n\} \to 0$, and
$(a_n)_{n\in\mathbb{N}}$ is a sequence of positive normalizing constants such
that the conditional distribution of $X_{n,1}$ given that $X_{n,1} > 0$
converges weakly to some nondegenerate limit. [In particular, the distribution
function (df) of $X_1$ then belongs to the domain of attraction of some extreme
value distribution.] Rootzen (1995, 2009) proved weak convergence of $e_n$ to a
Gaussian process; see Example 3.8 for details.
Such limit theorems have proved quite useful for semi-parametric statisti- 
cal analysis of the marginal tail behavior [Drees (2000, 2002, 2003)]. The 
present paper extends convergence to multivariate tail empirical processes 
and makes a small improvement of the results in Rootzen (2009). 

Tail empirical processes do not capture information on location in the 
extreme clusters, and hence do not catch the serial extremal dependence 
structures which are at the center of interest in connection with, for exam- 
ple, heat waves or river floods. A second class of applications of our main 
theorems is to joint survival functions and joint distributions of the order 
statistic of the values within an extreme cluster. 

The paper is organized as follows. In Section 2 we first introduce empirical 
processes of cluster functionals. This generalizes concepts first introduced by 
Yun (2000) and developed further by Segers (2003). We then derive uniform 
central limit theorems for these empirical processes under quite general
abstract conditions. Section 3 contains applications to tail array sums, with
the multivariate tail empirical process as a prominent example. In Section 
4 we consider empirical processes of indicator variables, and, in particular, 
joint distributions of variables and of the order statistics in the clusters of 
extreme values. Proofs are given in Section 5. 

2. Limit theorems for general empirical cluster processes. This section
first sets out the basic definitions and assumptions which are used
throughout the paper and then, in Section 2.1, gives conditions for
finite-dimensional convergence of the empirical processes
$(Z_n(f))_{f\in\mathcal{F}}$ (defined below). The following subsections
consider asymptotic tightness and asymptotic equicontinuity of these empirical
processes. As usual, finite-dimensional convergence together with either
asymptotic tightness or asymptotic equicontinuity gives convergence of $Z_n$
in the space $\ell^\infty(\mathcal{F})$ of bounded functions indexed by
$\mathcal{F}$.

For some $d \in \mathbb{N}$, let $E$ be a measurable subset of $\mathbb{R}^d$
containing 0 and let $(X_{n,i})_{1\le i\le n,\, n\in\mathbb{N}}$ be a triangular
array of row-wise stationary random variables (r.v.'s) with values in $E$.
Typically the $(X_{n,i})$ have been obtained by "renormalization" of some other
process, where the renormalization maps all nonextreme values to 0. A generic
example (cf. the Introduction) is $E = \mathbb{R}$ and
$X_{n,i} = ((X_i - u_n)/a_n)_+$, where $(X_i)_{i\in\mathbb{N}}$ is a stationary
univariate time series. Here $u_n$ tends to the right endpoint of the support
of $X_i$, so that $X_{n,i}$ is 0 unless $X_i$ is "large," that is, unless
$X_i > u_n$.

The "empirical process $Z_n$ of cluster functionals" is defined as

$$Z_n(f) := \frac{1}{\sqrt{n v_n}} \sum_{j=1}^{m_n} \bigl(f(Y_{n,j}) - E f(Y_{n,j})\bigr), \qquad f \in \mathcal{F}.$$

Here $Y_{n,j}$ is the $j$th block of $r_n$ consecutive values of the $n$th row
of $(X_{n,i})$. Thus, there are
$m_n := \lfloor n/r_n \rfloor := \max\{j \in \mathbb{N}_0 \mid j \le n/r_n\}$ blocks

$$Y_{n,j} := (X_{n,i})_{(j-1)r_n + 1 \le i \le j r_n}, \qquad 1 \le j \le m_n,$$

of length $r_n$. We write $Y_n$ for a "generic block" so that
$Y_n \stackrel{d}{=} Y_{n,1}$. The block lengths $r_n$ tend to infinity, but
slower than $n$, and

$$v_n := P\{X_{n,1} \ne 0\} \to 0.$$

Further, $\mathcal{F}$ is a class of "cluster functionals," that is, functions
which only depend on the part of the block which contains all nonvanishing
observations; see below.
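The renormalization and the division of a row into blocks can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy data, and the choices `u_n = 1.0`, `a_n = 0.5`, `r_n = 3` are hypothetical, and `fraction_nonzero` is merely an empirical stand-in for $v_n$.

```python
def standardized_excesses(xs, u_n, a_n):
    """Map raw observations X_i to X_{n,i} = ((X_i - u_n) / a_n)_+ ."""
    return [max((x - u_n) / a_n, 0.0) for x in xs]


def split_into_blocks(row, r_n):
    """Return the m_n = floor(n / r_n) consecutive blocks Y_{n,j} of length r_n.

    Observations after index r_n * m_n do not belong to any block.
    """
    m_n = len(row) // r_n
    return [row[j * r_n:(j + 1) * r_n] for j in range(m_n)]


def fraction_nonzero(row):
    """Empirical stand-in for v_n = P{X_{n,1} != 0} (an assumption of this sketch)."""
    return sum(1 for x in row if x != 0) / len(row)


# Hypothetical toy data; threshold and scale are made up for illustration.
raw = [0.5, 2.0, 0.25, 3.0, 0.75, 0.5, 1.5]
row = standardized_excesses(raw, u_n=1.0, a_n=0.5)
blocks = split_into_blocks(row, r_n=3)
```

With seven observations and block length three, the last observation is left over; this is the remainder term that reappears for tail array sums in Section 3.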

In the univariate case $E = \mathbb{R}$, cluster functionals have been
introduced by Yun (2000) and Segers (2003). The definition is as follows:

Definition 2.1. (i) The set $E_\cup := \bigcup_{l\in\mathbb{N}} E^l$ of vectors
of arbitrary length is equipped with the $\sigma$-field $\mathbb{E}_\cup$ that
is induced by the Borel-$\sigma$-fields on $E^l$, $l \in \mathbb{N}$.

(ii) For an arbitrary $k \in \mathbb{N}$ and $x = (x_1, \ldots, x_k) \in E^k$,
the core $x^c \in E_\cup$ of $x$ is defined by

$$x^c := \begin{cases} (x_i)_{i_1 \le i \le i_2}, & \text{if } x \ne (0,\ldots,0), \\ 0, & \text{otherwise,} \end{cases}$$

where

$$i_1 := \min\{i \in \{1,\ldots,k\} \mid x_i \ne 0\},$$

$$i_2 := \max\{i \in \{1,\ldots,k\} \mid x_i \ne 0\}.$$

The length of the core of $x$ is defined as $L(x) := i_2 - i_1 + 1$ if
$x^c \ne 0$ and $L(x) = 0$ if $x^c = 0$.

(iii) A measurable map $f : (E_\cup, \mathbb{E}_\cup) \to (\mathbb{R}, \mathbb{B})$
is called a cluster functional if $f(x) = f(x^c)$ for all $x \in E_\cup$, and
$f(0) = 0$.

Typical examples are functionals of the type

$$f(x_1, \ldots, x_k) := \sum_{i=1}^{k} \phi(x_i),$$

where $\phi : E \to \mathbb{R}$ satisfies $\phi(0) = 0$, which are related to
so-called tail array sums, and, in the case $E = [0,\infty)$,

$$f(x_1, \ldots, x_k) := \max_{1\le i\le k} x_i,$$

which corresponds to the (componentwise) maximum of a cluster. Many
more examples will be discussed in Sections 3 and 4.
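A minimal Python sketch of Definition 2.1 may help fix ideas. It assumes univariate blocks stored as lists and represents the zero core by the empty list; the function names are hypothetical and not taken from the paper.

```python
def core(x):
    """Return the core x^c: the stretch from the first to the last nonzero entry.

    The zero core is represented by the empty list (a convention of this sketch).
    """
    nonzero = [i for i, xi in enumerate(x) if xi != 0]
    if not nonzero:
        return []
    return list(x[nonzero[0]:nonzero[-1] + 1])


def core_length(x):
    """L(x) = i_2 - i_1 + 1 if the core is nonzero, and 0 otherwise."""
    return len(core(x))


def tail_array_functional(phi):
    """Build f(x_1, ..., x_k) = sum_i phi(x_i); requires phi(0) = 0."""
    return lambda x: sum(phi(xi) for xi in x)


def cluster_max(x):
    """Maximum of a block with values in [0, infinity); equals 0 for the zero block."""
    return max(x, default=0)


block = [0, 0, 3, 0, 5, 0]
f = tail_array_functional(lambda v: v * v)
```

Both examples satisfy the defining property $f(x) = f(x^c)$: zeros before the first and after the last nonzero value change neither the sum nor the maximum, while zeros inside the core are retained.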
The proofs below will use the well-known "big blocks, small blocks"
technique together with a $\beta$-mixing condition to reduce convergence to
convergence of sums over i.i.d. blocks. The $\beta$-mixing coefficients (also
called the coefficients of absolute regularity) for $(X_{n,i})_{1\le i\le n}$
are defined by

$$\beta_{n,k} := \sup_{1\le l\le n-k-1} E\Bigl(\sup_{B \in \mathcal{B}^n_{n,l+k+1}} |P(B \mid \mathcal{B}^l_{n,1}) - P(B)|\Bigr),$$

where $\mathcal{B}^j_{n,i}$ denotes the $\sigma$-field generated by
$(X_{n,l})_{i\le l\le j}$. Since the $X_{n,i}$ take values in a Polish space,
the supremum can be taken over a countable set of $B$'s, and hence is
measurable. [On general spaces "sup" has to be replaced by "ess-sup," which is
defined as a measurable function which is a.s. larger than or equal to
$|P(B \mid \mathcal{B}^l_{n,1}) - P(B)|$ for all $B \in \mathcal{B}^n_{n,l+k+1}$
and a.s. smaller than or equal to all other measurable functions with this
property.] In addition to the $\beta$-mixing coefficients and the lengths $r_n$
of the big blocks, the "big blocks, small blocks" technique uses an
intermediate sequence $l_n$ of integers, the lengths of small blocks which are
used to separate the big blocks in the proofs.

Throughout we will use the following basic assumptions:
(B1) The rows $(X_{n,i})_{1\le i\le n}$ are stationary, $l_n = o(r_n)$,
$l_n \to \infty$, $r_n = o(n)$, $r_n v_n \to 0$, $n v_n \to \infty$,

and

(B2) $\beta_{n,l_n} \frac{n}{r_n} \to 0$.

Sometimes we will also use the assumption
(B3) $\lim_{m\to\infty} \limsup_{n\to\infty} \beta_{n,m} = 0$.

It follows from $r_n v_n \to 0$ that $v_n \to 0$ and hence that nonzero values
of $X_{n,i}$ are rare events. The most important example we have in mind is the
standardized excesses given in (1.1). However, other examples occur in the
context of nonparametric density estimation or nonparametric regression
in a natural way (cf. Example 3.5). Since $n v_n$ is the expected number of
nonzero values of $(X_{n,i})_{1\le i\le n}$, the assumption $n v_n \to \infty$
seems necessary if one wants to obtain normally distributed limits.

More specifically, the assumption $r_n v_n \to 0$ means that the probability
of a block being nonzero tends to zero. In particular, it implies that if the
row variables are i.i.d., then, asymptotically, cores (or, equivalently,
clusters of "extremes") will have length one, as they intuitively should have.
To see this, note that if the variables in a row are independent, then
asymptotically the number of nonzero values in a block of length $r_n$ has a
Poisson distribution with mean $r_n v_n$ and that then the conditional
probability that there is more than one nonzero value in a block, given that
there is at least one nonzero value, is (approximately)
$(1 - e^{-r_n v_n} - r_n v_n e^{-r_n v_n}) / (1 - e^{-r_n v_n})$.
This tends to zero if and only if $r_n v_n \to 0$.
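The Poisson heuristic above is easy to check numerically. The following sketch evaluates the conditional probability of at least two nonzero values given at least one, for a Poisson count with mean standing in for $r_n v_n$; the function name and the chosen means are illustrative assumptions.

```python
import math


def prob_multiple_given_any(mean):
    """P(N >= 2 | N >= 1) for N ~ Poisson(mean), with mean playing the role of r_n * v_n."""
    p_zero = math.exp(-mean)          # P(N = 0)
    p_one = mean * p_zero             # P(N = 1)
    return (1.0 - p_zero - p_one) / (1.0 - p_zero)


# The ratio shrinks as the mean shrinks (it behaves like mean / 2 for small means).
for mean in (1.0, 0.1, 0.01, 0.001):
    print(mean, prob_multiple_given_any(mean))
```

For a fixed positive mean the ratio stays bounded away from zero, which is the sense in which $r_n v_n \to 0$ is needed for clusters of length one in the i.i.d. case.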
For a given sequence $(r_n)_{n\in\mathbb{N}}$, assumption (B2) requires a
minimum rate at which the mixing coefficients $\beta_{n,l}$ tend to 0 as
$l \to \infty$. The condition (B3), for example, holds if the $X_{n,i}$ are
obtained by renormalizing a single absolutely regular process.

Remark 2.2. (i) The proofs of Theorems 2.3 and 2.8, of Lemma 2.5(ii)
and (iii), and of Lemma 5.1 below, in fact, do not use the assumption
$r_n v_n \to 0$ of (B1), but only that $v_n \to 0$. The same remark applies to
Theorem 2.10 if one replaces condition (D5) below by the following slightly
stronger version: For all $\delta > 0$, $n \in \mathbb{N}$, $i \in \{0,1\}$,
$(e_j)_{1\le j\le \lfloor m_n/2\rfloor + 1} \in \{-1,0,1\}^{\lfloor m_n/2\rfloor + 1}$
and $k \in \{1,2\}$, the map
$\sup_{f,g\in\mathcal{F},\, \rho(f,g)<\delta} \sum_{j=1}^{\lfloor m_n/2\rfloor + i} e_j (f(Y^*_{n,j}) - g(Y^*_{n,j}))^k$
is measurable.

Hence, these results hold also if the assumption $r_n v_n \to 0$ is replaced by
the weaker $v_n \to 0$.

(ii) It is not essential that $E$ is a subset of $\mathbb{R}^d$. Indeed, one
may assume that $X_{n,i}$ takes on values in an arbitrary set $E$. Then one
chooses some special element $e_0 \in E$ which takes over the role of 0. In
this more general setting, a cluster functional is defined as a functional on
$\bigcup_{l\in\mathbb{N}} E^l$ whose value is not changed if $e_0$ is added at
the beginning or at the end of some vector in $\bigcup_{l\in\mathbb{N}} E^l$.

2.1. Convergence of fidis. We first give a general result on the conver- 
gence of the finite-dimensional marginal distributions (fidis), and then in- 
troduce simpler, but more restrictive assumptions, which also are sufficient 
for convergence. Proofs are deferred to Section 5. 

We will use the notation $x^{(k)}$ for the vector $(x_1, \ldots, x_k)$ made up
by the first $k$ components in the vector $x$, if $x$ has at least $k$
components, and otherwise $x^{(k)} = x$. Similarly, we write
$x^{(l:k)} = (x_l, \ldots, x_k)$ for the vector consisting of components number
$l$ to number $k$ in $x$, if $x$ has at least $k$ components, and otherwise
$x^{(l:k)}$ starts at component no. $l$ and ends at the end of $x$ (if $x$ is
shorter than $l$, then $x^{(l:k)} = 0$). As before, let $\mathcal{F}$ be a class
of cluster functionals, and recall that $Y_n \stackrel{d}{=} Y_{n,1}$, where
$Y_{n,1}$ is the first block in the $n$th row. For $f \in \mathcal{F}$ write

$$\Delta_n(f) := f(Y_n) - f(Y_n^{(r_n - l_n)})$$

for the difference between $f$ evaluated at the $r_n$ components of the entire
block and $f$ evaluated at the first $r_n - l_n$ components of the block. The
general "convergence conditions" are as follows:

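(C1) $E\bigl((\Delta_n(f) - E\Delta_n(f))^2 1_{\{|\Delta_n(f) - E\Delta_n(f)| \le \sqrt{n v_n}\}}\bigr) = o(r_n v_n)$,

$P\{|\Delta_n(f) - E\Delta_n(f)| > \sqrt{n v_n}\} = o(r_n/n)$

for all $f \in \mathcal{F}$.
(C2) $E\bigl((f(Y_n) - Ef(Y_n))^2 1_{\{|f(Y_n) - Ef(Y_n)| > \varepsilon\sqrt{n v_n}\}}\bigr) = o(r_n v_n)$

$\forall \varepsilon > 0$, $f \in \mathcal{F}$.

(C3) $\frac{1}{r_n v_n} \mathrm{Cov}(f(Y_n), g(Y_n)) \to c(f,g)$ $\forall f, g \in \mathcal{F}$.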

The block $Y_n^{(r_n - l_n)}$ is obtained from $Y_n$ by omitting a small block
of $l_n$ observations at the end. Accordingly, (C1) means that asymptotically
this omission does not influence the fidis of the empirical process of cluster
functionals (see the proof of Lemma 5.1). By the definition of cluster
functionals, this is usually fulfilled if, with high probability, there are few
or no nonzero observations in the omitted short blocks. Specifically, if
components number $r_n - l_n + 1$ to $r_n$ of $Y_n$ all are zero, then $Y_n$ and
$Y_n^{(r_n - l_n)}$ have the same core, and, thus, $\Delta_n(f) = 0$.
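The effect of dropping the last small block can be illustrated with a short Python sketch. It uses the core length as the cluster functional and hypothetical toy blocks with $r_n = 6$ and a small-block length of 2; none of these choices come from the paper.

```python
def core_length(x):
    """Cluster functional L(x): length of the stretch between first and last nonzero entry."""
    nonzero = [i for i, xi in enumerate(x) if xi != 0]
    return nonzero[-1] - nonzero[0] + 1 if nonzero else 0


def delta_n(f, block, l_n):
    """Delta_n(f) = f(Y_n) - f(Y_n^{(r_n - l_n)}): drop the last l_n entries of the block."""
    return f(block) - f(block[:len(block) - l_n])


# Last two entries are zero: truncation leaves the core unchanged.
quiet_tail = [0, 2, 0, 5, 0, 0]
# A nonzero value sits in the omitted small block: the core changes.
active_tail = [0, 2, 0, 0, 0, 5]
```

In the first block the difference vanishes, in the second it does not; condition (C1) asks that contributions of the second kind be asymptotically negligible.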

Assumption (C2) is the standard Lindeberg condition. The assumption 
of convergence of covariances, (C3), is the final ingredient needed to ensure 
finite-dimensional convergence in the present triangular array setup. 

Theorem 2.3. Suppose the basic assumptions (B1) and (B2) hold, and
that (C1)-(C3) are satisfied. Then the fidis of the empirical process
$(Z_n(f))_{f\in\mathcal{F}}$ of cluster functionals converge to the fidis of a
Gaussian process $(Z(f))_{f\in\mathcal{F}}$ with covariance function $c$.

In general, the convergence (C3) of the covariance function must be veri- 
fied directly. However, we also give additional sufficient conditions which are 
simpler to verify in some situations. A first very simple version, (C3'), re- 
quires convergence only after "truncation" to a fixed (but arbitrary) length. 
Before stating it, we recall the notation $L(Y_n)$ for the length of the core
of $Y_n$.

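(C3') For $f \in \mathcal{F}$ it holds that

(2.1) $\lim_{k\to\infty} \limsup_{n\to\infty} \frac{1}{r_n v_n} E\bigl(f(Y_n)^2 1_{\{L(Y_n) > k\}}\bigr) = 0,$

and for $f, g \in \mathcal{F}$ there is a sequence $R_{n,k}$ with
$\lim_{k\to\infty} \limsup_{n\to\infty} |R_{n,k}| = 0$ such that

(2.2) $\lim_{n\to\infty} \frac{1}{r_n v_n} E\bigl(f(Y_n) g(Y_n) 1_{\{L(Y_n) \le k\}}\bigr) + R_{n,k} = c_k(f,g).$

A typical situation when (2.1) holds is when the cluster lengths
$(L(Y_n))_{n=1}^{\infty}$ are tight under $P(\cdot \mid Y_n \ne 0)$ and
$(f(Y_n)^2)_{n\in\mathbb{N}}$ is uniformly integrable under
$P(\cdot \mid Y_n \ne 0)$, for $f \in \mathcal{F}$. This follows from the
observation that $\frac{1}{r_n v_n}|E(\cdot)| \le |E(\cdot \mid Y_n \ne 0)|$,
which in turn follows from $P\{Y_n \ne 0\} \le r_n v_n$.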

In a second assumption (C3") we generalize the powerful results of Segers
(2003) to the present abstract setting. In doing this, we do not aim at the
greatest possible generality, but give versions which suit our purposes best.
It may be noted that, unlike in the situation considered by Segers, in general
weak convergence of the indicators $1_{\{0\}}(X_{n,i})$ does not follow from
weak convergence of $X_{n,i}$. In the statement of the condition we use that
the value of a cluster functional $f$ applied to a sequence
$(x_i)_{i\in\mathbb{N}}$ with
$m_x := \sup\{i \in \mathbb{N} \mid x_i \ne 0\} < \infty$ can be defined in a
natural way as $f((x_i)_{1\le i\le m_x})$. The conditions are as follows:

(C3")

(C3.1") There is a sequence $W = (W_i)_{i\in\mathbb{N}}$ of $E$-valued r.v.'s
such that, for all $k \in \mathbb{N}$, the joint conditional distribution
$P^{(X_{n,i},\, 1_{\{0\}}(X_{n,i}))_{1\le i\le k} \mid X_{n,1} \ne 0}$
converges weakly to $P^{(W_i,\, 1_{\{0\}}(W_i))_{1\le i\le k}}$, and all
$f \in \mathcal{F}$ are a.s. continuous with respect to the distributions of
$W^{(k)}$ and $W^{(2:k)}$, for all $k$, that is,

(2.3) $P\{W^{(2:k)} \in D_{f,k-1},\ W_i = 0\ \forall i > k\} = P\{W^{(k)} \in D_{f,k},\ W_i = 0\ \forall i > k\} = 0$

with $D_{f,k}$ denoting the set of discontinuity points of $f|_{E^k}$.
(C3.2") For all $f \in \mathcal{F}$ the sequence $(f(Y_n)^2)_{n\in\mathbb{N}}$
is uniformly integrable under $P(\cdot)/(r_n v_n)$.

Again, (C3.2") is implied by the perhaps more intuitive condition that
$(f(Y_n)^2)_{n\in\mathbb{N}}$ is uniformly integrable under
$P(\cdot \mid Y_n \ne 0)$.

In the proof of the next two results we will, in fact, use a slightly weaker 
(but instead more complicated) version of (2.3); see Remark 2.6 below. 

Corollary 2.4. Suppose that (B1), (B2) and (C1) are satisfied. If,
furthermore, either (C2) and (C3') or else (B3) and (C3") hold, then the
fidis of the empirical process $(Z_n(f))_{f\in\mathcal{F}}$ of cluster
functionals converge to the fidis of a Gaussian process
$(Z(f))_{f\in\mathcal{F}}$. Specifically, (C3') implies that (C3) holds and
that the covariance function $c$ of $Z$ is obtained as

$$c(f,g) = \lim_{k\to\infty} c_k(f,g).$$

If (C3") holds, then

(2.4) $c(f,g) = E\bigl((fg)(W) - (fg)(W^{(2:\infty)})\bigr).$

Equation (2.4) is explained in Lemma 2.5 below. It generalizes the most 
important results of Segers (2003) to the present more abstract setting. 

Lemma 2.5. (i) If (B1) and (B3) hold, then

(2.5) $E(f(Y_n) \mid Y_n \ne 0) = \frac{1}{\theta_n} E\bigl(f(X_n^{(r_n)}) - f(X_n^{(2:r_n)}) \mid X_{n,1} \ne 0\bigr) + o(1),$

where the term $o(1)$ tends to 0 as $n$ tends to $\infty$ uniformly for all
cluster functionals $f$ such that $\|f\|_\infty \le C$, for any
$C \in \mathbb{R}$, and

$$\theta_n := \frac{P\{Y_n \ne 0\}}{r_n v_n} = P(X_n^{(2:r_n)} = 0 \mid X_{n,1} \ne 0)(1 + o(1)).$$

(ii) If (B1), (B3) and the assumption of (C3.1") all are satisfied, then

(2.6) $m_W := \sup\{i \ge 1 \mid W_i \ne 0\} < \infty$

and

$$\lim_{n\to\infty} \theta_n = \theta := P\{W_i = 0\ \forall i \ge 2\} = P\{m_W = 1\} > 0.$$

(iii) If (B1), (B3) and (C3.1") hold, then the conditional distribution
$P^{f(Y_n) \mid Y_n \ne 0}$ converges weakly to the probability measure

$$\mu_{f,W} := \frac{1}{\theta}\bigl(P\{f(W) \in \cdot\} - P\{f(W^{(2:\infty)}) \in \cdot,\ m_W \ge 2\}\bigr).$$

Note that $\mu_{f,W}(\mathbb{R}) = 1$ by (ii). However, it is not so obvious
that $\mu_{f,W}$ is indeed a positive (and hence a probability) measure.
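The total-mass claim can be verified in one line. The sketch below writes $\mu_{f,W}$ for the limit measure in part (iii) and $\theta = P\{m_W = 1\}$ as in part (ii); it assumes that $W_1 \ne 0$ a.s., which is suggested by the conditioning on $X_{n,1} \ne 0$ together with the convergence of the indicators in (C3.1"), so that $m_W \ge 1$ a.s.

```latex
\mu_{f,W}(\mathbb{R})
  = \frac{1}{\theta}\Bigl(P\{f(W) \in \mathbb{R}\}
      - P\{f(W^{(2:\infty)}) \in \mathbb{R},\ m_W \ge 2\}\Bigr)
  = \frac{1 - P\{m_W \ge 2\}}{\theta}
  = \frac{P\{m_W = 1\}}{\theta}
  = 1 .
```

Positivity of the measure is a different matter, since it involves a difference of two distributions evaluated on arbitrary Borel sets.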

Remark 2.6. We will prove Corollary 2.4 and Lemma 2.5 under the
following weaker version of the continuity assumption (2.3):

For $k \in \mathbb{N}$ and $I \subset \{1, \ldots, k\}$ let
$N_{k,I} := \{x \in E^k \mid x_i = 0\ \forall i \in I,\ x_i \ne 0\ \forall i \notin I\}$
and denote by $D_{f,k,I}$ the set of discontinuity points of $f|_{N_{k,I}}$.
Then we assume

(2.7) $P\{W^{(k)} \in D_{f,k,I},\ W^{(k+1:\infty)} = 0\} = 0$ $\forall k \in \mathbb{N}$, $I \subset \{1, \ldots, k\}$,

(2.8) $P\{W^{(2:k)} \in D_{f,k-1,I},\ W^{(k+1:\infty)} = 0\} = 0$ $\forall k \ge 2$, $I \subset \{1, \ldots, k-1\}$.

This version can be used in some examples where (2.3) is not satisfied,
because the boundary of $[0,\infty)^k$ belongs to the discontinuity sets
$D_{f,k}$ and, according to Lemma 2.5(ii), the r.v. $W_i$ equals 0 with
positive probability for $i > 1$.

In the situation considered by Segers (2003) [i.e., with $X_{n,i}$ defined by
(1.1) for a stationary time series whose finite-dimensional marginal
distributions all belong to the domain of attraction of some extreme value
distribution], the sequence $(W_i)_{i\in\mathbb{N}}$ is related to the
so-called tail sequence (or tail chain) $(U_i)_{i\in\mathbb{N}}$ [cf. Segers
(2003), Theorem 2] via $W_i = \max(U_i, 0)$. Then (C3") is automatically
satisfied, for example, for bounded cluster functionals if the discontinuity
set $D_{f,m}$ is a Lebesgue null subset of $(0,\infty)^m$ for all $m$ and
$f \in \mathcal{F}$, because the r.v.'s $U_i$ are continuous.

Further simpler, but more restrictive, sufficient conditions are given in 
Lemma 5.2 below. In particular, for bounded cluster functionals one obtains 
the following: 
Corollary 2.7. If $\|f\|_\infty = \sup_{x\in E_\cup} |f(x)| < \infty$ for all
$f \in \mathcal{F}$ and the conditions (B1), (B2), (B3) and (C3.1") hold, then
the fidis of the empirical process $(Z_n(f))_{f\in\mathcal{F}}$ of cluster
functionals converge to the fidis of a Gaussian process
$(Z(f))_{f\in\mathcal{F}}$ with covariance function $c$ defined by (2.4).

2.2. Asymptotic tightness. In this subsection we give conditions which
ensure asymptotic tightness of $Z_n$ in the space $\ell^\infty(\mathcal{F})$.
As a consequence, uniform central limit theorems for $Z_n$ hold if in addition
the conditions of Theorem 2.3 are satisfied. The alternative route via
asymptotic equicontinuity is considered in the next subsection.

In general, the supremum of $Z_n(f)$ taken over uncountably many cluster
functionals $f$ need not be measurable. Hence, in some instances, one has to
work with outer probabilities and expectations, denoted by $P^*$ and $E^*$ in
the following; see van der Vaart and Wellner (1996), Section 1.2, for details.
The sequence $(Z_n)_{n\in\mathbb{N}}$ is asymptotically tight if to any
$\varepsilon > 0$ there is a compact set $K \subset \ell^\infty(\mathcal{F})$
such that

$$\limsup_{n\to\infty} P^*\{Z_n \notin K^\delta\} < \varepsilon \qquad \text{for any } \delta > 0.$$

Here $K^\delta$ is the set of elements in $\ell^\infty(\mathcal{F})$ which are
at most a distance $\delta$ away from $K$.

We will use the assumptions (D1)-(D4) below to prove tightness. The first
two assumptions in various ways restrict the sizes of the functions in
$\mathcal{F}$. In particular, (D1) ensures that sample paths of $Z_n$ belong to
the space $\ell^\infty(\mathcal{F})$ of bounded functions on $\mathcal{F}$.
Assumption (D3) is an asymptotic continuity condition on the covariance
function which is needed to ensure that the limiting process has continuous
sample paths. The most crucial condition, (D4), restricts the complexity of the
index set $\mathcal{F}$ via the so-called bracketing entropy. To state this
assumption, the following concept is needed.

The bracketing number $N_{[\cdot]}(\varepsilon, \mathcal{F}, L_2^n)$ here is
defined as the smallest number $N_\varepsilon$ such that for each
$n \in \mathbb{N}$ there exists a partition
$(\mathcal{F}^\varepsilon_{n,k})_{1\le k\le N_\varepsilon}$ of $\mathcal{F}$
such that

(2.9) $E^* \sup_{f,g\in\mathcal{F}^\varepsilon_{n,k}} (f(Y_n) - g(Y_n))^2 \le \varepsilon^2 r_n v_n$ $\forall 1 \le k \le N_\varepsilon$.

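The assumptions are as follows:

(D1) The index set $\mathcal{F}$ consists of cluster functionals $f$ such that
$E(f(Y_n)^2)$ is finite for all $n \ge 1$ and such that the envelope function

$$F(x) := \sup_{f\in\mathcal{F}} |f(x)|$$

is finite for all $x \in E_\cup$.
(D2)

$$E^*\bigl(F(Y_n) 1_{\{F(Y_n) > \varepsilon\sqrt{n v_n}\}}\bigr) = o\bigl(r_n \sqrt{v_n/n}\bigr) \qquad \forall \varepsilon > 0.$$

(D3) There exists a semi-metric $\rho$ on $\mathcal{F}$ such that $\mathcal{F}$
is totally bounded (i.e., for all $\varepsilon > 0$ the set $\mathcal{F}$ can be
covered by finitely many balls with radius $\varepsilon$ w.r.t. $\rho$) such
that

$$\lim_{\delta\downarrow 0} \limsup_{n\to\infty} \sup_{f,g\in\mathcal{F},\, \rho(f,g)<\delta} \frac{1}{r_n v_n} E(f(Y_n) - g(Y_n))^2 = 0.$$

(D4)

$$\lim_{\delta\downarrow 0} \limsup_{n\to\infty} \int_0^\delta \sqrt{\log N_{[\cdot]}(\varepsilon, \mathcal{F}, L_2^n)}\, d\varepsilon = 0.$$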

Theorem 2.8. If the basic assumptions (B1) and (B2) hold and (D1)-(D4)
are satisfied, then the process $Z_n$ is asymptotically tight in
$\ell^\infty(\mathcal{F})$. If in addition the finite-dimensional
distributions converge [which, in particular, holds if (C1)-(C3) also are
satisfied], then $Z_n$ converges to a Gaussian process $Z$ with covariance
function $c$.

We collect a number of comments and variations of the conditions of the 
theorem in the following remark. In particular, we consider a strengthened 
version (D2') of (D2): 

(D2') $E^*\bigl(F^2(Y_n) 1_{\{F(Y_n) > \varepsilon\sqrt{n v_n}\}}\bigr) = o(r_n v_n)$ $\forall \varepsilon > 0$.

The proof of part (ii) of the remark is given in Section 5. 

Remark 2.9. (i) If, for all $\varepsilon > 0$, there exists a partition
$(\mathcal{F}^\varepsilon_k)_{1\le k\le N_\varepsilon}$ of $\mathcal{F}$ which
does not depend on $n$ and which satisfies

$$E^* \sup_{f,g\in\mathcal{F}^\varepsilon_k} (f(Y_n) - g(Y_n))^2 \le \varepsilon^2 r_n v_n \qquad \forall 1 \le k \le N_\varepsilon,$$

then (D3) and (D4) can be replaced with the simpler condition

$$\int_0^\delta \sqrt{\log N_\varepsilon}\, d\varepsilon < \infty$$

for some $\delta > 0$ [cf. Theorem 2.11.9 of van der Vaart and Wellner (1996)].

(ii) If $F(Y_n)$ satisfies the Lindeberg condition (D2'), then (C2) and (D2)
are satisfied. In particular, this holds if $n v_n \to \infty$ and

(2.10) $E^* F(Y_n)^{2+\delta} = O(r_n v_n)$ for some $\delta > 0$.

(iii) Thus, if (B1), (B2), (C3), (D1), (D3) and (D4) hold with a bounded
envelope function $F$, then the empirical processes $Z_n$ converge to a centered
Gaussian process with covariance function $c$.
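The implications in part (ii) that concern the envelope can be traced with two elementary bounds. This is only a sketch, not the proof given in Section 5; it reads (D2) as the requirement $E^*(F(Y_n) 1_{\{F(Y_n) > \varepsilon\sqrt{n v_n}\}}) = o(r_n\sqrt{v_n/n})$ and uses only that outer expectations are monotone. On the event $\{F(Y_n) > \varepsilon\sqrt{n v_n}\}$ one has $F(Y_n) \le F^2(Y_n)/(\varepsilon\sqrt{n v_n})$, so (D2') gives (D2); a moment bound of the same type shows that (2.10) gives (D2') when $n v_n \to \infty$.

```latex
E^*\bigl(F(Y_n)\, 1_{\{F(Y_n) > \varepsilon\sqrt{n v_n}\}}\bigr)
  \le \frac{1}{\varepsilon\sqrt{n v_n}}\,
      E^*\bigl(F^2(Y_n)\, 1_{\{F(Y_n) > \varepsilon\sqrt{n v_n}\}}\bigr)
  = \frac{o(r_n v_n)}{\sqrt{n v_n}}
  = o\bigl(r_n \sqrt{v_n/n}\bigr),

E^*\bigl(F^2(Y_n)\, 1_{\{F(Y_n) > \varepsilon\sqrt{n v_n}\}}\bigr)
  \le \frac{E^* F(Y_n)^{2+\delta}}{(\varepsilon\sqrt{n v_n})^{\delta}}
  = O(r_n v_n)\,(n v_n)^{-\delta/2}
  = o(r_n v_n).
```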
2.3. Asymptotic equicontinuity. Like tightness, the asymptotic
equicontinuity of $Z_n$ w.r.t. $\rho$, that is,

$$\forall \varepsilon, \eta > 0\ \exists \delta > 0:\ \limsup_{n\to\infty} P^*\Bigl\{\sup_{f,g\in\mathcal{F},\, \rho(f,g)<\delta} |Z_n(f) - Z_n(g)| > \varepsilon\Bigr\} < \eta,$$

is necessary and sufficient for the convergence of $Z_n$, provided all fidis of
$Z_n$ converge.

To prove asymptotic equicontinuity, we need a technical measurability
condition, condition (D5) below, and, crucially, suitable bounds (D6) or
(D6') on the rate of increase of covering numbers. The condition (D5), in
particular, is satisfied if the processes $(f(Y_n))_{f\in\mathcal{F}}$ are
separable. The condition (D6) is stated in terms of a "random entropy," while
(D6'), which implies (D6), is phrased in terms of uniform entropy. To state the
assumptions, we need the following definitions: for a given semi-metric $d$ on
$\mathcal{F}$, the (random) covering number $N(\varepsilon, \mathcal{F}, d)$ is
the minimum number of balls with radius $\varepsilon$ w.r.t. $d$ needed to
cover $\mathcal{F}$. The condition (D6) bounds the rate of increase of
$N(\varepsilon, \mathcal{F}, d_n)$ as $\varepsilon$ tends to 0 for the random
semi-metric

$$d_n(f,g) := \Bigl(\frac{1}{n v_n} \sum_{j=1}^{m_n} (f(Y^*_{n,j}) - g(Y^*_{n,j}))^2\Bigr)^{1/2},$$

that is, the $L_2$-semi-metric w.r.t. the empirical measure
$(n v_n)^{-1} \sum_{j=1}^{m_n} \delta_{Y^*_{n,j}}$, where $Y^*_{n,j}$,
$1 \le j \le m_n$, are i.i.d. copies of $Y_{n,1}$. In (D6') we instead use the
supremum of all covering numbers $N(\varepsilon, \mathcal{F}, d_Q)$, where
$d_Q(f,g) := (\int (f - g)^2\, dQ)^{1/2}$ and $Q$ ranges over the set of
discrete probability measures $\mathcal{Q}$.
With this notation, the conditions are as follows:

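(D5) For all $\delta > 0$, $n \in \mathbb{N}$,
$(e_j)_{1\le j\le \lfloor m_n/2\rfloor} \in \{-1,0,1\}^{\lfloor m_n/2\rfloor}$
and $k \in \{1,2\}$, the map
$\sup_{f,g\in\mathcal{F},\, \rho(f,g)<\delta} \sum_{j=1}^{\lfloor m_n/2\rfloor} e_j (f(Y^*_{n,j}) - g(Y^*_{n,j}))^k$
is measurable.

(D6)

$$\lim_{\delta\downarrow 0} \limsup_{n\to\infty} P^*\Bigl\{\int_0^\delta \sqrt{\log N(\varepsilon, \mathcal{F}, d_n)}\, d\varepsilon > \tau\Bigr\} = 0 \qquad \forall \tau > 0.$$

(D6') The envelope function $F$ is measurable with
$E(F(Y_n)^2) = O(r_n v_n)$ and

$$\int_0^1 \sup_{Q\in\mathcal{Q}} \sqrt{\log N\Bigl(\varepsilon \Bigl(\int F^2\, dQ\Bigr)^{1/2}, \mathcal{F}, d_Q\Bigr)}\, d\varepsilon < \infty.$$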

Theorem 2.10. Suppose the basic assumptions (B1) and (B2) hold and
that (D1), (D2'), (D3) and (D5) are satisfied. Then if also (D6) [or, more
restrictively, (D6')] holds, it follows that $Z_n$ is asymptotically
equicontinuous. Further, if in addition the finite-dimensional distributions
converge [which, in particular, holds if (C1) and (C3) also are satisfied],
then $Z_n$ converges to a Gaussian process with covariance function $c$.






H. DREES AND H. ROOTZEN 



Remark 2.11. In view of (D6'), one can apply the powerful Vapnik-
Cervonenkis theory to verify asymptotic equicontinuity. In particular, (D6')
is satisfied if $\mathcal{F}$ is a so-called VC-class or, more generally, a
VC-hull class. We refer to Section 2.6 of van der Vaart and Wellner (1996) for
an outline of the most important uniform bounds on covering numbers
$N(\varepsilon(\int F^2\, dQ)^{1/2}, \mathcal{F}, d_Q)$.
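For a finite class of functionals and observed blocks, the random semi-metric $d_n$ of Section 2.3 and an upper bound on the covering number $N(\varepsilon, \mathcal{F}, d_n)$ can be computed directly. The Python sketch below is an illustration only: the greedy construction yields a valid $\varepsilon$-cover but not necessarily a minimal one, and the toy blocks, the value `n_v_n = 4` (the number of nonzero observations, used as a stand-in for $n v_n$) and the threshold-count functionals are hypothetical.

```python
import math


def d_n(f, g, blocks, n_v_n):
    """Random semi-metric: sqrt( (n v_n)^(-1) * sum_j (f(Y_j) - g(Y_j))^2 )."""
    return math.sqrt(sum((f(y) - g(y)) ** 2 for y in blocks) / n_v_n)


def greedy_cover_size(functions, blocks, n_v_n, eps):
    """Size of a greedy eps-cover with centers in the class.

    Every function ends up within distance eps of some center, so the
    result is an upper bound on the covering number N(eps, F, d_n).
    """
    centers = []
    for f in functions:
        if all(d_n(f, c, blocks, n_v_n) > eps for c in centers):
            centers.append(f)
    return len(centers)


def exceedance_count(t):
    """Cluster functional counting the block entries that exceed t >= 0."""
    return lambda block: sum(1 for x in block if x > t)


# Hypothetical blocks standing in for the i.i.d. copies Y*_{n,j}.
blocks = [[0, 0.5, 0, 0], [0, 0, 0, 0], [0.9, 0.2, 0, 0.7]]
family = [exceedance_count(t) for t in (0.0, 0.25, 0.5, 0.75)]
```

As $\varepsilon$ decreases, the size of the cover grows; condition (D6) controls how fast the logarithm of this quantity may grow near $\varepsilon = 0$.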

3. Generalized tail array sums. Generalizing the tail empirical process
$e_n(x)$ (for some fixed $x > 0$), Rootzen, Leadbetter and de Haan (1990)
considered so-called tail array sums

(3.1) $\sum_{i=1}^{n} \phi(X_{n,i})$

for functions $\phi : \mathbb{R} \to \mathbb{R}$ satisfying $\phi(0) = 0$ and
$X_{n,i}$ defined by (1.1); see also Leadbetter and Rootzen (1993), Leadbetter
(1995) and Rootzen, Leadbetter and de Haan (1998).

Like the tail empirical process, these tail array sums do not allow inference
about the extremal dependence structure, as the summands $\phi(X_{n,i})$ depend
on just one observation. However, if $X_{n,i}$ denotes the vector of $d$
consecutive standardized excesses, that is,

(3.2) $X_{n,i} := \Bigl(\Bigl(\frac{X_i - u_n}{a_n}\Bigr)_+, \Bigl(\frac{X_{i+1} - u_n}{a_n}\Bigr)_+, \ldots, \Bigl(\frac{X_{i+d-1} - u_n}{a_n}\Bigr)_+\Bigr),$

then the statistic (3.1) with $\phi : (E, \mathbb{B}(E)) \to (\mathbb{R}, \mathbb{B})$
(and $E = \mathbb{R}^d$) contains information on the extremal dependence
structure.

Therefore, in the general setting of a row-wise stationary triangular array
$(X_{n,i})_{n\in\mathbb{N},\, 1\le i\le n}$ used in Section 2, the generalized
(standardized) tail array sum (tail array sum for short) given by a measurable
function $\phi : (E, \mathbb{B}(E)) \to (\mathbb{R}, \mathbb{B})$ with
$\phi(0) = 0$ is defined as

(3.3) $Z_n(\phi) := \frac{1}{\sqrt{n v_n}} \sum_{i=1}^{n} (\phi(X_{n,i}) - E\phi(X_{n,i})).$

The tail array sum (3.3) can be obtained as the empirical process $Z_n$
evaluated at the cluster functional

$$g_\phi : E_\cup \to \mathbb{R}, \qquad x = (x_1, \ldots, x_k) \mapsto \sum_{i=1}^{k} \phi(x_i),$$

if $n$ is a multiple of $r_n$. In general,

$$Z_n(\phi) - Z_n(g_\phi) = \frac{1}{\sqrt{n v_n}} \sum_{i=r_n m_n + 1}^{n} (\phi(X_{n,i}) - E\phi(X_{n,i})),$$

which is asymptotically negligible under weak conditions specified in
Corollary 3.6 below.
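The decomposition of a tail array sum into block contributions plus a remainder can be checked on toy data. The sketch below works with the uncentered, unnormalized sums (centering and the factor $(n v_n)^{-1/2}$ act in the same way on both sides); the function names and the example row are hypothetical.

```python
def tail_array_sum(row, phi):
    """Uncentered tail array sum: sum of phi(X_{n,i}) over the whole row."""
    return sum(phi(x) for x in row)


def block_sum(row, r_n, phi):
    """Sum of the cluster functional g_phi over the m_n = floor(n / r_n) full blocks."""
    m_n = len(row) // r_n
    g_phi = lambda block: sum(phi(x) for x in block)
    return sum(g_phi(row[j * r_n:(j + 1) * r_n]) for j in range(m_n))


def remainder(row, r_n, phi):
    """Contribution of the observations i = r_n * m_n + 1, ..., n outside all blocks."""
    m_n = len(row) // r_n
    return sum(phi(x) for x in row[r_n * m_n:])


row = [0, 2, 0, 0, 3, 0, 1]      # n = 7
phi = lambda x: x * x            # satisfies phi(0) = 0
```

With `r_n = 3` the last observation falls outside the two full blocks and makes up the remainder; when the row length is a multiple of `r_n` the remainder is an empty sum.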

For the remainder of this section, we assume that a family $\Phi$ of functions
of the above type is given, and assume it is totally bounded w.r.t. a
semi-metric $\rho_\Phi$ and has a finite envelope function
$\phi_{\max} := \sup_{\phi\in\Phi} |\phi|$.

Example 3.1 (Multivariate tail empirical processes). If $X_{n,i}$ is defined
as in (3.2) and $\Phi := \{1_{(x,\infty)} \mid x \in [0,\infty)^d\}$, then
$(Z_n(g_\phi))_{\phi\in\Phi}$ is the (reparametrized) multivariate tail
empirical process. In particular, if $d = 1$, then
$(Z_n(g_\phi))_{\phi\in\Phi}$ is a reparametrization of the tail empirical
process $e_n$ discussed in the Introduction.

For simplicity, we will assume that the $X_i$ are uniformly distributed; the
general case can be easily obtained by a marginal quantile transformation [cf.
Rootzen (2009) for details]. Then one chooses $a_n = 1 - u_n = v_n$ for a
sequence of thresholds $u_n$ tending to 1, so that the conditional distribution
of the standardized excesses $X_{n,i} = (X_i - u_n)/a_n$, given that they are
strictly positive, is also uniform. Thus, it suffices to consider
$\Phi := \{1_{(x,1]} \mid x \in [0,1]^d\}$ with envelope function
$\phi_{\max} = 1_{(0,1]^d}$ and metric
$\rho_\Phi(1_{(x,1]}, 1_{(y,1]}) := \max_{1\le i\le d} |x_i - y_i|$,
$x, y \in [0,1]^d$.

Example 3.2 (Upcrossings). If one is interested in upcrossings of a
univariate time series over intervals $[x,y]$, then one may define $X_{n,i}$ as in
Example 3.1 with $d = 2$ and consider $\Phi := \{1_{[0,x)\times(y,1]} \mid x, y \in [0,1],\ x < y\}$
with envelope function $1_{\{(x,y)\in[0,1]^2 \mid x<y\}}$.
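
To make the upcrossing functional concrete, here is a small Python sketch. It is an illustration under stated assumptions, not the authors' code: the excesses are taken to lie in [0, 1], the pair of consecutive excesses plays the role of $X_{n,i}$ with $d = 2$, and the function name and sample values are hypothetical.

```python
import numpy as np

def upcrossing_indicator_sum(excess, x, y):
    """Count i with excess[i] in [0, x) and excess[i+1] in (y, 1]."""
    e = np.asarray(excess, dtype=float)
    first, second = e[:-1], e[1:]        # the two coordinates of X_{n,i}, d = 2
    hit = (first >= 0) & (first < x) & (second > y) & (second <= 1)
    return int(np.sum(hit))

# Hypothetical standardized excesses; two upcrossings of the interval [0.3, 0.8].
excess = [0.0, 0.1, 0.9, 0.2, 0.95, 0.0]
count = upcrossing_indicator_sum(excess, x=0.3, y=0.8)
```

Because the second coordinate must exceed $y \ge 0$, the indicator vanishes at the zero vector, as required of the functions in $\Phi$.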

Example 3.3 (Compound insurance claim). If $X_i$ denotes the ith claim
of an insurance portfolio with deductible $u_n + a_n t$ and $X_{n,i}$ is as in (1.1), then,
with $\phi_t: \mathbb{R} \to [0,\infty)$ given by $\phi_t(x) = (x - t)1_{(t,\infty)}(x)$, the sum
$\sum_{i=1}^{n} \phi_t(X_{n,i})$ is the standardized total claimed amount. Thus, the empirical
process $(Z_n(g_{\phi_t}))_{t\ge 0}$ corresponding to
$\Phi := \{(x - t)1_{(t,\infty)}(x) \mid t \ge 0\}$ describes the influence of the deductible on
the random amount the insurance has to pay.

Example 3.4 (Bootstrapping the Hill estimator). A stationary time
series $(X_i)_{i\in\mathbb{N}}$ has extreme value index $\gamma > 0$ if its marginal survival function
$\bar F$ is regularly varying with index $-1/\gamma$, that is, if
$\lim_{t\to\infty} \bar F(tx)/\bar F(t) = x^{-1/\gamma}$. Let $X_{n,i} := (X_i/u_n) 1_{\{X_i > u_n\}}$,
$\phi_1(x) = \log(x) 1_{\{x>1\}}$ and $\phi_2(x) = 1_{\{x>1\}}$,
so that $E\phi_2(X_{n,i}) = v_n$ and
$\gamma_n = E\phi_1(X_{n,i})/E\phi_2(X_{n,i}) = E\phi_1(X_{n,i})/v_n = E(\log(X_i/u_n) \mid X_i > u_n) \to \gamma$
[cf. de Haan and Ferreira (2006), Theorem 1.2.1 and Remark 1.2.3]. Then the
Hill estimator $\hat\gamma_n$ of $\gamma$ may be written as

(3.4)    $\hat\gamma_n := \frac{\sum_{i=1}^{n} \log(X_i/u_n) 1_{\{X_i>u_n\}}}{\sum_{i=1}^{n} 1_{\{X_i>u_n\}}} = \frac{\gamma_n + \tilde Z_n(\phi_1)/\sqrt{n v_n}}{1 + \tilde Z_n(\phi_2)/\sqrt{n v_n}}$.

Write $g_k := g_{\phi_k}$, $k \in \{1,2\}$, and suppose we draw independent blocks
$Y^*_{n,i}$ from the empirical distribution of $Y_{n,i}$, $1 \le i \le m_n$. Then a bootstrap version
of the Hill estimator is obtained as

$\gamma_n^* := \frac{\sum_{i=1}^{m_n} g_1(Y^*_{n,i})}{\sum_{i=1}^{m_n} g_2(Y^*_{n,i})}$.

Example 3.5 (Kernel density estimators). In this simple example we 
demonstrate that applications of the theory presented in Section 2 are not 
restricted to extreme value theory. Further examples may be obtained from 
the literature on "local empirical processes." For the analysis of such pro- 
cesses for i.i.d. data we refer to Einmahl (1997), Gine, Mason and Zaitsev 
(2003) and Gine and Mason (2008) and to the lists of references in these 
papers. 

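Suppose that $(X_i)_{i\in\mathbb{N}}$ is a univariate stationary time series whose marginal
df H has a Lebesgue density h. Kernel estimators of the type

$\hat h_n(x_0) := \frac{1}{n b_n} \sum_{i=1}^{n} K\Big(\frac{X_i - x_0}{b_n}\Big)$

are probably the most widely used nonparametric estimators for $h(x_0)$ ($x_0 \in \mathbb{R}$).
Here K denotes a suitable kernel, for example, a probability density with
support $[-1, 1]$, and $(b_n)_{n\in\mathbb{N}}$ is a sequence of bandwidths tending to 0. Let

$X_{n,i} := \Big(\frac{X_i - x_0}{b_n} + 2\Big) 1_{[x_0-b_n,\,x_0+b_n]}(X_i), \quad 1 \le i \le n$,
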
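where the constant 2 has been inserted to ensure $X_{n,i} > 0$ for
$X_i \in [x_0 - b_n, x_0 + b_n]$. Let $H_n$ be the corresponding empirical df. Then integration by
parts yields

$\hat h_n(x_0) = \frac{1}{b_n} \int_{[1,3]} K(x-2)\, H_n(dx) = \frac{1}{b_n} \int_{[-1,1]} \big(1 - H_n(y+2)\big)\, K(dy)$
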
provided that K has bounded variation. Hence, for $Z_n(y) = \tilde Z_n(1_{(y+2,\infty)})$,
$y \in [-1,1]$, and $n = r_n m_n$, we have that

$\sqrt{n/v_n}\, b_n \big(\hat h_n(x_0) - E\hat h_n(x_0)\big) = \int_{[-1,1]} Z_n(y)\, K(dy)$,

where $\sqrt{n/v_n}\, b_n \sim \sqrt{n/(2h(x_0)b_n)}\, b_n = \sqrt{n b_n/(2h(x_0))}$ as $n \to \infty$, if h is
continuous and positive at $x_0$. Thus, one obtains the asymptotic normality
of $\hat h_n(x_0)$ from the convergence of $\tilde Z_n$ (or $Z_n$) toward a Gaussian process.
Indeed, this way it is not difficult to derive normal approximations for $\hat h_n$
uniformly over families of kernels with compact support.
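
The reduction of the kernel estimator to a triangular array of "rare" observations can be illustrated with a short Python sketch. It assumes that $X_{n,i}$ equals $(X_i - x_0)/b_n + 2$ when $X_i$ lies in the window $[x_0 - b_n, x_0 + b_n]$ and 0 otherwise, uses the Epanechnikov kernel as one possible kernel with support $[-1, 1]$, and checks that the estimator computed from the array agrees with the direct formula. All function names and the simulated data are hypothetical.

```python
import numpy as np

def epanechnikov(t):
    """A probability density with support [-1, 1]."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)

def kernel_estimate(obs, x0, b, kernel=epanechnikov):
    """Direct kernel density estimate (1 / (n b)) * sum_i K((X_i - x0) / b)."""
    obs = np.asarray(obs, dtype=float)
    return float(np.sum(kernel((obs - x0) / b)) / (len(obs) * b))

def local_array(obs, x0, b):
    """X_{n,i}: shifted, rescaled observation inside the window, 0 outside."""
    obs = np.asarray(obs, dtype=float)
    inside = np.abs(obs - x0) <= b
    return np.where(inside, (obs - x0) / b + 2.0, 0.0)

def kernel_estimate_from_array(xn, b, kernel=epanechnikov):
    """Same estimate, using only the nonzero entries of the array X_{n,i}."""
    nonzero = xn > 0
    return float(np.sum(kernel(xn[nonzero] - 2.0)) / (len(xn) * b))

rng = np.random.default_rng(0)
obs = rng.normal(size=500)               # simulated data, for illustration only
xn = local_array(obs, x0=0.0, b=0.3)
direct = kernel_estimate(obs, x0=0.0, b=0.3)
via_array = kernel_estimate_from_array(xn, b=0.3)
```

Only the observations inside the window produce nonzero entries, so the fraction of nonzero entries plays the role of $v_n$, which shrinks with the bandwidth.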

To obtain conditions for weak convergence of tail array sums, we first
focus on families $\Phi$ such that the envelope function $\phi_{\max}$ is bounded, which
is true in Examples 3.1, 3.2 and 3.5, but not in Example 3.3 (unless the
support of $X_{n,i}$ is uniformly bounded). We let $\mathcal{F} := \{g_\phi \mid \phi \in \Phi\}$ be equipped
with the semi-metric $\rho(g_\phi, g_\psi) = \rho_\Phi(\phi, \psi)$.

Corollary 3.6. Suppose that $\phi_{\max} = \sup_{\phi\in\Phi}|\phi|$ is bounded and
measurable, that $\Phi$ is totally bounded w.r.t. $\rho_\Phi$, that (B1) and (B2) hold, and
that $r_n = o(\sqrt{n v_n})$. Further assume that

(3.5)    $E\Big(\sum_{i=1}^{r_n} 1_{\{X_{n,i} \ne 0\}}\Big)^2 = O(r_n v_n)$.

Then the conditions (C1), (D1) and (D2') hold, and thus also (C2) and
(D2) are satisfied. Moreover,

(3.6)    $\sup_{\phi\in\Phi} |\tilde Z_n(\phi) - Z_n(g_\phi)| \to 0$ in outer probability.

If, in addition, (C3) holds and one of the following two sets of conditions,

(i) (D4) with a partition of $\mathcal{F}$ independent of n, or

(ii) (D3), (D5) and (D6),

are satisfied, then $(\tilde Z_n(\phi))_{\phi\in\Phi}$, and the empirical processes $(Z_n(g_\phi))_{\phi\in\Phi}$ of
cluster functionals, converge weakly to a Gaussian process with covariance
function c.

Remark 3.7. (i) It is possible to replace (C3) in the corollary by more
basic assumptions. Specifically, assume that the cluster lengths $L(Y_n)$ satisfy

(3.7)    $\lim_{k\to\infty} \limsup_{n\to\infty} \frac{1}{r_n v_n} P\{L(Y_n) > k\} = 0$

[which by Lemma 5.2(vii) holds if (B3) is satisfied], that there exist functions
$d_j: \Phi^2 \to \mathbb{R}$ such that, for $k \in \mathbb{N}$ and $\phi, \psi \in \Phi$,

(3.8)    $\frac{1}{v_n} E\big(\phi(X_{n,1})\psi(X_{n,k})\big) \to d_{k-1}(\phi,\psi)$ as $n \to \infty$,





and that

(3.9)    $E\Big(\sum_{i=1}^{r_n} 1_{\{X_{n,i} \ne 0\}}\Big)^{2+\delta} = O(r_n v_n)$

for some $\delta > 0$. Then (C3'), and hence, by Corollary 2.4, also (C3) hold
with

(3.10)    $c(g_\phi, g_\psi) = d_0(\phi,\psi) + \sum_{i=1}^{\infty} \big(d_i(\phi,\psi) + d_i(\psi,\phi)\big)$.

The proof is given in Section 5.

(ii) Suppose that the following simpler version of (C3'') is satisfied, viz.
that there exists a sequence $(W_i)_{i\in\mathbb{N}}$ of E-valued random variables such
that, for all $k \in \mathbb{N}$, $P^{(X_{n,1},X_{n,k}) \mid X_{n,1}\ne 0} \to P^{(W_1,W_k)}$ weakly, with
$P\{W_k \in D_\phi \setminus \{0\}\} = 0$ for all $\phi \in \Phi$, $k \in \mathbb{N}$, where $D_\phi$ is the discontinuity set of
$\phi$. Then, in view of Lemma 2.5, Remark 2.6 and the boundedness of $\phi$ and $\psi$,

$\frac{1}{v_n} E\phi(X_{n,1})\psi(X_{n,k}) = E\big(\phi(X_{n,1})\psi(X_{n,k}) \mid X_{n,1} \ne 0\big) \to E\phi(W_1)\psi(W_k) =: d_{k-1}(\phi,\psi)$,

so that equation (3.8) holds.

Example 3.8 (Multivariate tail empirical processes, ctd.). In this
example we give a set of conditions for the convergence of the multivariate tail
empirical process from Example 3.1 for uniformly distributed r.v.'s $X_i$. We
then discuss how the condition (C3) on convergence of covariances may be
checked in the present situation. Finally, we show that the central
condition (3.11) may be weakened in the univariate case to condition (3.13). This
improves earlier results in the literature.

Thus, we first show that if $r_n = o(\sqrt{n v_n})$, (B1), (B2) and (C3) are satisfied,
and there exist a constant K and a $\delta > 0$ such that, for all sufficiently large
n,

(3.11)    $\frac{1}{r_n v_n} E\Big(\sum_{i=1}^{r_n} 1_{(x,y]}\big((X_i - u_n)/a_n\big)\Big)^2 \le K |\log(y-x)|^{-(1+\delta)} \quad \forall\, 0 \le x < y \le 1,\ y - x \le 1/2$,

then the multivariate tail empirical process

$\Big( \frac{1}{\sqrt{n v_n}} \sum_{i=1}^{n} \big(1_{(x,1]}(X_{n,i}) - P\{X_{n,i} \in (x,1]\}\big) \Big)_{x\in[0,1]^d}$

converges weakly to a Gaussian process with covariance function c.

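Clearly, (3.11) implies (3.5). By Corollary 3.6, it is hence enough to show
that condition (i) of the corollary is satisfied. Now, to each $\varepsilon > 0$, let
$\eta = \eta_\varepsilon := \exp\big(-(K^{-1}d^{-2}\varepsilon^2)^{-1/(1+\delta)}\big)$ and define sets

$\mathcal{F}_{(i_1,\ldots,i_d)} := \big\{ g_\phi \mid \phi = 1_{\times_{l=1}^{d}(x_l,1]},\ (i_l - 1)\eta \le x_l \le \min(i_l\eta, 1)\ \forall\, 1 \le l \le d \big\}, \quad i_1,\ldots,i_d \in \{1,\ldots,\lceil 1/\eta \rceil\}$,

such that $\bigcup_{i_1,\ldots,i_d\in\{1,\ldots,\lceil 1/\eta\rceil\}} \mathcal{F}_{(i_1,\ldots,i_d)} = \mathcal{F}$. Since, by (B1) and (3.11),

$E \sup_{g_\phi, g_\psi \in \mathcal{F}_{(i_1,\ldots,i_d)}} |g_\phi(Y_n) - g_\psi(Y_n)|^2$
$\quad \le E\Big( \sum_{i=1}^{r_n} \sum_{l=1}^{d} 1_{((i_l-1)\eta,\, i_l\eta]}\big((X_{i+l-1} - u_n)/a_n\big) \Big)^2$
$\quad \le d^2 \max_{1\le l\le d} E\Big( \sum_{i=1}^{r_n} 1_{((i_l-1)\eta,\, i_l\eta]}\big((X_{i+l-1} - u_n)/a_n\big) \Big)^2$
$\quad \le \varepsilon^2 r_n v_n$,

it follows that

$\log N_{[\cdot]}(\varepsilon, \mathcal{F}, L_2^n) \le \log(\lceil 1/\eta \rceil^d) = O(\varepsilon^{-2/(1+\delta)})$

as $\varepsilon \downarrow 0$. Hence, the condition (D4) on entropy with bracketing holds with a
partition independent of n, as required to prove the claim.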

The convergence (C3) of covariance functions which was used above may
sometimes be replaced by simpler conditions. Specifically, Remark 3.7 gives
sufficient conditions for (C3) to hold, for general $d \in \mathbb{N}$. Assume, for example,
that all bivariate distributions $(X_1, X_m)$ belong to the domain of attraction
of some bivariate extreme value distribution. Then, since the limiting
random variables $W_i$ are continuous on $(0, \infty)$, the assumptions of Remark
3.7(ii) are satisfied, and, hence, (3.8) holds [cf. Segers (2003), Theorem 2].
Further, condition (3.9) holds if and only if for some $\delta > 0$

(3.12)    $E\Big(\sum_{i=1}^{r_n} 1_{(u_n,1]}(X_i)\Big)^{2+\delta} = O(r_n v_n)$.

For the case $d = 1$, the condition (3.11) can be weakened to the
requirement that

(3.13)    $E\Big(\sum_{i=1}^{r_n} 1_{(x,y]}(X_{n,i})\Big)^2 \le h(y-x)\, r_n v_n \quad \forall\, 0 \le x < y \le 1$,

for some function $h: (0,\infty) \to (0,\infty)$ satisfying $\lim_{t\downarrow 0} h(t) = 0$. To see this,
note that the functions $\phi_x = 1_{(x,1]}$, $x \in [0,1]$, are linearly ordered, and hence
so are the corresponding cluster functionals $g_{\phi_x}$, $x \in [0,1]$. Hence,
$\mathcal{F} = \{g_{\phi_x} \mid x \in [0,1]\}$ is a VC class of functions [van der Vaart and Wellner (1996),
Section 2.6]. Thus, according to Remark 2.11, (D6') [and hence also (D6)] is
satisfied. The measurability condition (D5) holds, since all processes
occurring in this setting are separable. Moreover, (D3) is satisfied for the metric
$\rho(g_{\phi_x}, g_{\phi_y}) := |y - x|$:

$\limsup_{n\to\infty} \frac{1}{r_n v_n} \sup_{x,y\in[0,1],\,|y-x|\le\delta} E\big(g_{\phi_x}(Y_n) - g_{\phi_y}(Y_n)\big)^2$
$\quad = \limsup_{n\to\infty} \frac{1}{r_n v_n} \sup_{x,y\in[0,1],\,|y-x|\le\delta} E\Big(\sum_{i=1}^{r_n} 1_{(x,y]}(X_{n,i})\Big)^2$
$\quad \le \sup_{0<t\le\delta} h(t) \to 0$

as $\delta \downarrow 0$ by (3.13), so that version (ii) of Corollary 3.6 applies. This proves
the claim that (3.11) may be weakened to (3.13) in the univariate case.

If we could assume that $\{X_i;\ 1 \le i \le n\}$ could be split up into consecutive
independent blocks of length $r_n$, then (3.13) would amount to assuming
that $E(Z_n(g_{\phi_y}) - Z_n(g_{\phi_x}))^2 \le h(|y - x|)$, for some h with
properties as above, that is, to assuming that $Z_n$ is uniformly mean
square continuous. However, in the proofs in Section 5 we use mixing to
translate to cases where this independence assumption in fact can be made,
and, accordingly, (3.13) seems quite minimal. In fact, in view of the
counterexamples in Hahn (1977), it may even be surprising that this condition
is sufficient.

Rootzén (1995, 2009) proved convergence of the univariate tail empirical
process using a more restrictive version of (3.11) and the stronger
condition that $r_n = o((n v_n)^{1/2-\varepsilon})$ for some $\varepsilon > 0$. In Drees (2000) Rootzén's
conditions were slightly weakened, to the requirements that
$r_n = o((n v_n)^{1/2} \log^{-2}(n v_n))$ and that

(3.14)    $E\Big(\sum_{i=1}^{r_n} 1_{(x,y]}(X_{n,i})\Big)^2 \le K (y-x)\, r_n v_n \quad \forall\, 0 \le x < y \le 1$,

instead of (3.11). Condition (3.14) is much more restrictive than (3.11) for
small $y - x$. In many specific time series models, it was condition (3.14) (for
small $y - x$) that turned out to be most difficult to verify; see, for example,
the discussion of the solutions of a stochastic recurrence equation in Drees
(2000), Section 4. Therefore, it might be useful that the bound in (3.11)
converges to 0 much more slowly as $y - x$ tends to 0.

It is possible to deal with Examples 3.2 and 3.5 in a similar fashion. 

As already mentioned, Example 3.3 does not fit into the framework of 
Corollary 3.6 if the underlying df belongs to the domain of attraction of an 
extreme value distribution with nonnegative extreme value index, because 
then the support is not bounded. In that case, condition (3.5) must be 
strengthened. 

Corollary 3.9. In the setting of Corollary 3.6 the assertions remain
true if $\phi_{\max}$ is measurable but not necessarily bounded, provided (3.5) is
replaced with

(3.15)    $E\Big(\sum_{i=1}^{r_n} \phi_{\max}(X_{n,i})\Big)^{2+\delta} = O(r_n v_n)$ for some $\delta > 0$.

Example 3.10 (Compound insurance claim, ctd.). In the setting of
Example 3.3, uniform convergence of the empirical process of cluster
functionals can be expected only if the deductible t is restricted to some bounded
set. Therefore, we consider the set $\Phi_T := \{\phi_t \mid t \in [0,T]\}$ for an arbitrary
$T \in (0, \infty)$. This set is totally bounded w.r.t. the metric $\rho_\Phi(\phi_s, \phi_t) := |s - t|$.
The envelope function is $\phi_{\max}(x) = \phi_0(x) = x_+$.

Suppose conditions (B1), (B2), (C3), (3.5) and

(3.16)    $E\Big(\sum_{i=1}^{r_n} X_{n,i}\Big)^{2+\delta} = O(r_n v_n)$

for some $\delta > 0$, are satisfied. Then the empirical process $(Z_n(g_{\phi_t}))_{0\le t\le T}$
converges weakly to a Gaussian process.

To see this, first observe that the functions $\phi_t$ are monotonically
decreasing in t. Hence, $\mathcal{F}$ is a VC class of functions, so that (D6) holds (see Remark
2.10). Since all sample paths are continuous, the measurability condition
(D5) trivially holds.

To prove (D3), check that

$\sup_{0\le s<t\le T,\,|t-s|\le\delta} \frac{1}{r_n v_n} E\Big(\sum_{i=1}^{r_n} \big((X_{n,i} - s)_+ - (X_{n,i} - t)_+\big)\Big)^2$
$\quad \le \sup_{0\le s<t\le T,\,|t-s|\le\delta} \frac{1}{r_n v_n} E\Big(\sum_{i=1}^{r_n} (t-s)\, 1_{(s,\infty)}(X_{n,i})\Big)^2$
$\quad \le \frac{\delta^2}{r_n v_n} E\Big(\sum_{i=1}^{r_n} 1_{\{X_{n,i}\ne 0\}}\Big)^2$.

By (3.5), the limsup of the right-hand side (as n tends to $\infty$) is bounded by
a multiple of $\delta^2$, which yields (D3). Further, (3.16) is just a reformulation of
(3.15) to the present setting. Hence, all the conditions of Corollary 3.9 have
been verified, and thus the result follows.
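
The elementary inequality behind the verification of (D3) can be checked numerically. The Python sketch below is an illustration only: it evaluates $\phi_t(x) = (x - t)_+$ on a grid and confirms that, for $s < t$, the increment $\phi_s(x) - \phi_t(x)$ lies between 0 and $(t - s) 1_{(s,\infty)}(x)$. The function names and the grid are hypothetical choices.

```python
import numpy as np

def phi(t, x):
    """phi_t(x) = (x - t)_+ : standardized claim amount above level t."""
    return np.maximum(np.asarray(x, dtype=float) - t, 0.0)

def increment_bound_holds(s, t, x):
    """Check 0 <= phi_s(x) - phi_t(x) <= (t - s) * 1{x > s} for s < t."""
    x = np.asarray(x, dtype=float)
    diff = phi(s, x) - phi(t, x)
    bound = (t - s) * (x > s)
    # Small tolerance guards against floating-point rounding only.
    return bool(np.all(diff >= 0) and np.all(diff <= bound + 1e-12))

x = np.linspace(0.0, 5.0, 501)
ok = all(increment_bound_holds(s, t, x)
         for s in (0.0, 0.5, 1.0) for t in (1.5, 2.0, 3.0))
```

The bound shows that the family is Lipschitz in t with a factor that vanishes outside the set of exceedances, which is why a second-moment condition on the number of nonzero observations suffices here.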

By Corollary 2.4, the condition (C3) in turn follows if, in addition, one
assumes that all finite-dimensional marginal distributions of the time series
$(X_i)_{i\in\mathbb{N}}$ belong to the domain of attraction of some extreme value
distributions and that the normalizing constants $u_n$ and $a_n$ are chosen accordingly.
Then (C3.1'') holds [cf. Segers (2003), Theorem 2], and (C3.2'') also follows
from (3.15) and Lemma 5.2(vi).

Example 3.11 (Bootstrapping the Hill estimator, ctd.). Continuing
Example 3.4, we now sketch proofs of asymptotic normality of the Hill
estimator and of consistency of the block bootstrap. Full process convergence
may also be obtained and is useful if, for example, $u_n$ is replaced by the
$k_n$th largest order statistic, for some suitable sequence $k_n$. We use
asymptotic normality to show consistency of the block bootstrap, but the hope
is that the bootstrap has better small-sample properties than the normal
approximation with estimated variance.
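
As a rough illustration of the block bootstrap discussed here, the following Python sketch computes the Hill estimator as a ratio of sums of the cluster functionals $g_1$ and $g_2$ over blocks and then resamples whole blocks with replacement. It is a minimal sketch under stated assumptions, not the authors' implementation: the data are simulated i.i.d. Pareto variables with $\gamma = 1/2$, the threshold is an empirical quantile, the block length is arbitrary, and all function names are hypothetical.

```python
import numpy as np

def hill_estimator(obs, u):
    """Mean of log(X_i / u) over the exceedances X_i > u."""
    obs = np.asarray(obs, dtype=float)
    exceed = obs > u
    return float(np.sum(np.log(obs[exceed] / u)) / np.sum(exceed))

def block_functionals(obs, u, r):
    """Per-block sums g_1 (log-excesses) and g_2 (number of exceedances)."""
    obs = np.asarray(obs, dtype=float)
    m = len(obs) // r
    blocks = obs[: m * r].reshape(m, r)
    exceed = blocks > u
    # Non-exceedances are replaced by u so that they contribute log(1) = 0.
    g1 = np.log(np.where(exceed, blocks, u) / u).sum(axis=1)
    g2 = exceed.sum(axis=1).astype(float)
    return g1, g2

def bootstrap_hill(g1, g2, n_boot, rng):
    """Resample m blocks with replacement; return the bootstrap Hill estimates.

    Assumes every resample contains at least one exceedance.
    """
    m = len(g1)
    out = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, m, size=m)
        out[b] = g1[idx].sum() / g2[idx].sum()
    return out

rng = np.random.default_rng(1)
obs = rng.pareto(2.0, size=4000) + 1.0   # survival function x^(-2), gamma = 1/2
u = float(np.quantile(obs, 0.95))
gamma_hat = hill_estimator(obs, u)
g1, g2 = block_functionals(obs, u, r=20)
boot = bootstrap_hill(g1, g2, n_boot=200, rng=rng)
```

The spread of the values in `boot` around `gamma_hat` is the bootstrap counterpart of the sampling variability of the Hill estimator; for dependent data the block length would have to grow with n, as the conditions of the example require.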

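For this we assume that (B1) and (B2) hold and, with the notation of Example
3.4, that for $k, l \in \{1,2\}$

(3.17)    $E\Big(\sum_{i=1}^{r_n} \phi_k(X_{n,i})\Big)^4 = O(r_n v_n)$,

$\lim_{n\to\infty} \frac{1}{r_n v_n} \sum_{i=1}^{r_n} \sum_{j=1}^{r_n} E\big(\phi_k(X_{n,i})\phi_l(X_{n,j})\big) = \sigma_{kl}$.

Then, in a similar way as in the proofs of Corollaries 3.6 and 3.9, it can be
seen that $(\tilde Z_n(\phi_k))_{1\le k\le 2}$ converges to a centered normal distribution with
covariance matrix $(\sigma_{kl})_{1\le k,l\le 2}$. It follows that

(3.18)    $\hat\gamma_n = \gamma_n + (n v_n)^{-1/2}\big(\tilde Z_n(\phi_1) - \gamma \tilde Z_n(\phi_2)\big) + o_P((n v_n)^{-1/2})$,

and thus that

(3.19)    $\sqrt{n v_n}\,(\hat\gamma_n - \gamma_n) \to N(0,\ \sigma_{11} + \gamma^2\sigma_{22} - 2\gamma\sigma_{12})$ in distribution.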




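Writing $X^{(n)} := (X_i)_{1\le i\le n}$ for the original data, we next show that

(3.20)    $\sup_{t\in\mathbb{R}} \big| P\big(\sqrt{n v_n}\,(\gamma_n^* - \hat\gamma_n) \le t \mid X^{(n)}\big) - P\{\sqrt{n v_n}\,(\hat\gamma_n - \gamma_n) \le t\} \big| = o_P(1)$,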

that is, consistency of the block bootstrap estimator. With the notation from 
Example 3.4, 



In- 



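From arguments as in the proof of Lemma 5.1 below [in particular, (5.4)],
it follows that if condition (3.17) holds, then $Z_n(g_k g_l) = O_P(1)$. Hence, for
$k, l \in \{1,2\}$,

$\frac{1}{r_n v_n} Cov\big(g_k(Y^*_{n,1}), g_l(Y^*_{n,1}) \mid X^{(n)}\big)$
$\quad = \frac{1}{r_n v_n} \Big( \frac{1}{m_n} \sum_{i=1}^{m_n} g_k(Y_{n,i}) g_l(Y_{n,i}) - \frac{1}{m_n} \sum_{i=1}^{m_n} g_k(Y_{n,i}) \cdot \frac{1}{m_n} \sum_{i=1}^{m_n} g_l(Y_{n,i}) \Big)$
$\quad = \frac{1}{r_n v_n} Cov\big(g_k(Y_{n,1}), g_l(Y_{n,1})\big) - \frac{1}{m_n} Z_n(g_k) Z_n(g_l)$
$\qquad + \frac{1}{\sqrt{n v_n}} \big( Z_n(g_k g_l) - E(g_l(Y_{n,1})) Z_n(g_k) - E(g_k(Y_{n,1})) Z_n(g_l) \big)$
$\quad \to \sigma_{kl}$

in probability. Similarly, as in (3.18), we have that

$\gamma_n^* = \hat\gamma_n + (n v_n)^{-1} \sum_{i=1}^{m_n} \big( g_1(Y^*_{n,i}) - \gamma g_2(Y^*_{n,i}) - E(g_1(Y^*_{n,i}) - \gamma g_2(Y^*_{n,i}) \mid X^{(n)}) \big) + o_P((n v_n)^{-1/2})$.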

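Moreover, one can conclude from (3.17) that

$m_n E\big( |(n v_n)^{-1/2} (g_k(Y^*_{n,1}) - E(g_k(Y^*_{n,1}) \mid X^{(n)}))|^3 \mid X^{(n)} \big) = O_P\big(m_n (n v_n)^{-3/2} r_n v_n\big) = O_P\big((n v_n)^{-1/2}\big)$,

and, thus, the Berry-Esseen inequality yields

$\sup_{t\in\mathbb{R}} \Big| P\Big( (n v_n)^{-1/2} \sum_{i=1}^{m_n} \big( g_1(Y^*_{n,i}) - \gamma g_2(Y^*_{n,i}) - E(g_1(Y^*_{n,i}) - \gamma g_2(Y^*_{n,i}) \mid X^{(n)}) \big) \le t \,\Big|\, X^{(n)} \Big) - \Phi\big( (\sigma_{11} + \gamma^2\sigma_{22} - 2\gamma\sigma_{12})^{-1/2}\, t \big) \Big| = o_P(1)$.

In view of (3.19), this proves (3.20).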



4. Indicator functionals. Another important class of cluster functionals
are indicator functions. Notice that by definition these indicator functions
are applied to whole clusters, while in Examples 3.1, 3.2 and 3.5 above
indicator functions of single observations $X_{n,i}$ were summed up. For $C \subset E_\cup$
the indicator function $1_C$ is a cluster functional if and only if the set satisfies
the following two conditions:

• $x = (x_1,\ldots,x_l) \in C \iff (0, x_1,\ldots,x_l) \in C \iff (x_1,\ldots,x_l, 0) \in C$ for all
$x \in E_\cup$,

• $0 \notin C$.

In this section we study situations where the set of cluster functionals is
of the form $\mathcal{F} = \{1_C \mid C \in \mathcal{C}\}$ for some family $\mathcal{C} \subset 2^{E_\cup}$ of such sets.

Example 4.1 (Joint survival function of cluster values). The conditional
joint survival function of the first k observations in a cluster core $Y_n^c$, given
that the core has length greater than or equal to k, can be estimated by

$\frac{\sum_{j=1}^{m_n} 1_{C_{t_1,\ldots,t_k}}(Y_{n,j})}{\sum_{j=1}^{m_n} 1_{C_{0,\ldots,0}}(Y_{n,j})}$

with

$C_{t_1,\ldots,t_k} := \{x \in E_\cup \mid \exists j: x_i = 0\ \forall\, 1 \le i \le j,\ x_{j+i} > t_i\ \forall\, 1 \le i \le k\}$.

Obviously, a limit theorem for the empirical process

$Z_n(t_1,\ldots,t_k) := Z_n(1_{C_{t_1,\ldots,t_k}}), \quad t_1,\ldots,t_k \in [0,1]$,

is useful for the asymptotic analysis of the above estimator.

Example 4.2 (Order statistics of cluster values). Let

$D_{t_1,\ldots,t_k} := \bigcap_{j=1}^{k} E_{j,t_j}$

with

$E_{j,t_j} := \Big\{ (x_1,\ldots,x_m) \in E_\cup \,\Big|\, m \in \mathbb{N},\ \sum_{i=1}^{m} 1_{(t_j,1]}(x_i) \ge j \Big\}$,

that is, $D_{t_1,\ldots,t_k}$ contains all vectors of arbitrary length such that the jth
largest value exceeds $t_j$ for all $1 \le j \le k$. Then the empirical process
$Z_n(t_1,\ldots,t_k) = Z_n(1_{D_{t_1,\ldots,t_k}})$ describes the standardized joint empirical
survival function of the k largest order statistics of the cluster cores.
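
The indicator functionals of Examples 4.1 and 4.2 can be written down directly. The Python sketch below is an illustration only, with hypothetical function names: it extracts the cluster core of a block (the stretch from the first to the last nonzero value) and evaluates the two indicators, assuming nonnegative block values and at least one threshold.

```python
import numpy as np

def cluster_core(block):
    """Sub-vector from the first to the last nonzero entry (empty if all zero)."""
    b = np.asarray(block, dtype=float)
    nz = np.flatnonzero(b)
    return b[nz[0]: nz[-1] + 1] if nz.size else b[:0]

def first_values_indicator(block, t):
    """Example 4.1: the first k core values exceed t_1, ..., t_k (k = len(t))."""
    core = cluster_core(block)
    k = len(t)
    return int(k >= 1 and len(core) >= k
               and bool(np.all(core[:k] > np.asarray(t, dtype=float))))

def order_statistics_indicator(block, t):
    """Example 4.2: for each j = 1..k, at least j core values exceed t_j."""
    core = cluster_core(block)
    return int(len(t) >= 1 and
               all(int(np.sum(core > tj)) >= j
                   for j, tj in enumerate(t, start=1)))

# Hypothetical block with one cluster; its core is (0.4, 0.0, 0.9, 0.2).
block = [0.0, 0.0, 0.4, 0.0, 0.9, 0.2, 0.0]
core = cluster_core(block)
```

Both functions depend on the block only through its core and return 0 for a block without exceedances, which are the two defining properties of a cluster functional.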
Next we discuss the conditions imposed in Theorem 2.10 to ensure
convergence of the empirical processes considered in this section.

The conditions (Dl) and (D2') are trivial, and condition (CI) holds by 
Lemma 5.2(ii). 

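If $r_n v_n \to 0$ [which is a part of assumption (B1)], then (C3) is equivalent
to

(4.1)    $\frac{1}{r_n v_n} P\{Y_{n,1} \in C \cap D\} \to c(1_C, 1_D)$,

since $Cov(1_C(Y_n), 1_D(Y_n)) = P\{Y_n \in C \cap D\} - P\{Y_n \in C\} \cdot P\{Y_n \in D\}$ and
since $P\{Y_n \in C\} \cdot P\{Y_n \in D\} = O((r_n v_n)^2) = o(r_n v_n)$.
Similarly, condition (D3) can be reformulated as

(4.2)    $\lim_{\delta\downarrow 0} \limsup_{n\to\infty} \sup_{C,D\in\mathcal{C},\ \rho_\mathcal{C}(C,D)\le\delta} \frac{1}{r_n v_n} P\{Y_n \in C \triangle D\} = 0$,

where $C \triangle D = (C \setminus D) \cup (D \setminus C)$ denotes the symmetric difference between
C and D and $\rho_\mathcal{C}$ is a semi-metric on $\mathcal{C}$ that induces a semi-metric $\rho$ on $\mathcal{F}$
via $\rho(1_C, 1_D) := \rho_\mathcal{C}(C, D)$.
If (C3'') holds, then

$\frac{1}{r_n v_n} P\{Y_n \in C \triangle D\} \to P\{(W_i)_{i\ge 1} \in C \triangle D\} - P\{(W_i)_{i\ge 2} \in C \triangle D\}$,

where $(W_i)_{i\ge 1} \in C \triangle D$ is interpreted as $(W_i)_{1\le i\le m} \in C \triangle D$ for some
$m \ge m_W$, that is, $W_i = 0$ for all $i > m$. If the following continuity property holds

$\lim_{\delta\downarrow 0} \sup_{C,D\in\mathcal{C},\ \rho_\mathcal{C}(C,D)\le\delta} \big( P\{(W_i)_{i\ge 1} \in C \triangle D\} - P\{(W_i)_{i\ge 2} \in C \triangle D\} \big) = 0$,

then results by Fabian (1970) may help to conclude (D3). However, in the
examples of this section we will verify (D3) in a more direct way.

Finally, if $\mathcal{C}$ is a VC-class, then condition (D6') is fulfilled (cf. Remark
2.11).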

The following result gives conditions for the convergence of the empirical
processes in Examples 4.1 and 4.2. Here we assume that the random variables
$X_{n,i}$ are [0,1]-valued so that it suffices to consider the processes $Z_n$ with
index set $[0,1]^k$. If the r.v.'s $X_{n,i}$ are standardized excesses defined in (1.1)
(as we assume in the second part of the following corollary), then this can
be achieved by a simple quantile transformation (cf. Example 3.1).

Corollary 4.3. (i) Let $Z_n(t_1,\ldots,t_k)$ be as in Examples 4.1 or 4.2, with
$t_i \in [0,1]$, $i = 1,\ldots,k$, and suppose (B1), (B2), (B3), (C3.1'') and (D3) hold
with $\rho(1_{C_{s_1,\ldots,s_k}}, 1_{C_{t_1,\ldots,t_k}}) := \sum_{i=1}^{k} |s_i - t_i|$, respectively,
$\rho(1_{D_{s_1,\ldots,s_k}}, 1_{D_{t_1,\ldots,t_k}}) := \sum_{i=1}^{k} |s_i - t_i|$. Then $Z_n$ converges to a
continuous Gaussian process. If $Z_n$ is as in Example 4.1, then the covariance
function of the process is

(4.3)    $c((s_1,\ldots,s_k),(t_1,\ldots,t_k)) = P\{(W_i)_{i\ge 1} \in C_{\max(s_1,t_1),\ldots,\max(s_k,t_k)}\} - P\{(W_i)_{i\ge 2} \in C_{\max(s_1,t_1),\ldots,\max(s_k,t_k)}\}$,

and if $Z_n$ is as in Example 4.2, then the covariance function of the process
is

(4.4)    $c((s_1,\ldots,s_k),(t_1,\ldots,t_k)) = P\Big\{(W_i)_{i\ge 1} \in \bigcap_{j=1}^{k} E_{j,\max(s_j,t_j)}\Big\} - P\Big\{(W_i)_{i\ge 2} \in \bigcap_{j=1}^{k} E_{j,\max(s_j,t_j)}\Big\}$.

(ii) More specifically, assume that the r.v.'s $X_{n,i}$ are standardized excesses
of a uniformly distributed univariate stationary time series (as in Example
3.1) and that all finite-dimensional marginal distributions belong to the
domain of attraction of some extreme value distribution. Then the assertions
of part (i) hold true if the conditions (B1), (B2) and (B3) are satisfied.



In Example 4.1 we only considered the first k "extremes" in each cluster,
where k is a fixed number. Since for most time series the cluster size is not
bounded, the resulting empirical process does not give a full picture of the
stochastic behavior of the clusters. To overcome this drawback, in the final
example we define and analyze an empirical process of cluster functionals
that takes all values of each cluster into account. As the cluster length is
random, this requires working with a quite complex index set.

Example 4.4 (Joint distribution of all cluster values). Recalling the
notation $L(x)$ for the length, say, j, of the core $x^c = (x^c_1,\ldots,x^c_j)$ of a vector
x, we set

$C_{j,t_1,\ldots,t_j} := \{x \in E_\cup \mid L(x) = j,\ x^c_i \in [0,t_i],\ \forall\, 1 \le i \le j\}$.

Then the empirical process $Z_n(j,t_1,\ldots,t_j) := Z_n(1_{C_{j,t_1,\ldots,t_j}})$, $j \in \mathbb{N}$, $t_1,\ldots,t_j \ge 0$,
describes the joint distribution of all the values in a cluster.
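
A direct transcription of this indicator may help to see why the index set is more complex than in Example 4.1: the number of thresholds must match the random core length. The Python sketch below is illustrative only; the function names are hypothetical and the block values are assumed to be nonnegative.

```python
import numpy as np

def cluster_core(block):
    """Sub-vector from the first to the last nonzero entry (empty if all zero)."""
    b = np.asarray(block, dtype=float)
    nz = np.flatnonzero(b)
    return b[nz[0]: nz[-1] + 1] if nz.size else b[:0]

def all_values_indicator(block, j, t):
    """1 if the core has length j and its i-th value lies in [0, t_i] for all i."""
    core = cluster_core(block)
    t = np.asarray(t, dtype=float)
    return int(j >= 1 and len(t) == j and len(core) == j
               and bool(np.all(core <= t)))

# Hypothetical block whose core (0.4, 0.0, 0.9) has length 3.
block = [0.0, 0.4, 0.0, 0.9, 0.0]
inside = all_values_indicator(block, 3, [0.5, 0.1, 1.0])
outside = all_values_indicator(block, 3, [0.3, 0.1, 1.0])
wrong_length = all_values_indicator(block, 2, [0.5, 1.0])
```

Indicators with different values of j are supported on disjoint sets of blocks, which is consistent with a limiting covariance that vanishes unless the two lengths coincide.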

Like in Corollary 4.3(ii), for simplicity, we focus on the case that the
clusters are based on standardized exceedances $X_{n,i}$ of a uniformly distributed
stationary time series $(X_i)_{i\in\mathbb{N}}$, such that all finite-dimensional marginal
distributions belong to the domain of attraction of some extreme value
distribution. However, it is not difficult to generalize this result to a slightly more
general setting which is analogous to the one considered in Corollary 4.3(i).
Suppose that (B1), (B2) and (B3) hold, and that

(4.5)    $E\big(L(Y_n)^{1+\zeta} \mid Y_n \ne 0\big) = O(1)$ for some $\zeta > 0$.

Then $Z_n$ converges weakly to a continuous Gaussian process with covariance
function

(4.6)    $c((j,s_1,\ldots,s_j),(k,t_1,\ldots,t_k)) = \delta_{j,k} \big( P\{L(W) = k,\ W_i \le s_i \wedge t_i,\ \forall\, 1 \le i \le k\} - P\{L(W^{(2;\infty)}) = k,\ (W^{(2;\infty)})^c_i \le s_i \wedge t_i,\ \forall\, 1 \le i \le k\} \big)$,

where $\delta_{j,k}$ is one if $j = k$ and zero otherwise.

The proof of this uniform central limit theorem is given in Section 5.

5. Proofs. In this section we prove the results from Sections 2-4. We
start with fidi convergence, then consider asymptotic tightness and
asymptotic equicontinuity, and finally prove the corollaries from Sections 3 and
4.

The first step in the proof of fidi convergence is to use mixing to bring
the problem back to classical limit theory for i.i.d. variables. Let $Y^*_{n,j}$ denote
i.i.d. copies of the original blocks $Y_{n,j}$ (which are identically distributed, but
are not assumed to be independent, and which in interesting cases typically
are dependent).

Lemma 5.1. Suppose (B1), (B2) and (C1) are satisfied. Then the fidis of
$(Z_n(f))_{f\in\mathcal{F}}$ converge weakly if and only if the fidis of the sums of independent
blocks

$Z_n^*(f) := \frac{1}{\sqrt{n v_n}} \sum_{j=1}^{m_n} \big(f(Y^*_{n,j}) - Ef(Y^*_{n,j})\big), \quad f \in \mathcal{F}$,

converge weakly. In this case the limit distributions are the same.

Proof. Let

$\Delta^*_{n,j}(f) := f(Y^*_{n,j}) - f\big((Y^*_{n,j})^{(r_n - l_n)}\big), \quad 1 \le j \le m_n$,

and let $\Delta_{n,j}(f)$ be defined in the same way, but instead based on the
original (dependent) blocks, so that $\Delta^*_{n,j}(f) \overset{d}{=} \Delta_{n,j}(f) \overset{d}{=} \Delta_n(f)$ for each j, with
$\Delta_n(f)$ as in (C1). By Theorem 1 in Petrov [(1975), Section IX.1] applied to
the i.i.d. random variables $X_{nk} := (n v_n)^{-1/2} \Delta^*_{n,k}(f)$, condition (C1) implies
that

(5.1)    $\frac{1}{\sqrt{n v_n}} \sum_{j=1}^{m_n} \big(\Delta^*_{n,j}(f) - E\Delta^*_{n,j}(f)\big) = o_P(1) \quad \forall f \in \mathcal{F}$.

We next prove the analogous convergence for the dependent random
variables, that is, that

(5.2)    $\frac{1}{\sqrt{n v_n}} \sum_{j=1}^{m_n} \big(\Delta_{n,j}(f) - E\Delta_{n,j}(f)\big) = o_P(1) \quad \forall f \in \mathcal{F}$.

Using Theorem 1 in Petrov [(1975), Section IX.1] again, it also follows
from (C1) that the convergence analogous to (5.1) holds for the sums of the
even numbered blocks

(5.3)    $\frac{1}{\sqrt{n v_n}} \sum_{j=1}^{\lfloor m_n/2 \rfloor} \big(\Delta^*_{n,2j}(f) - E\Delta^*_{n,2j}(f)\big) = o_P(1)$.

Since the even numbered blocks $Y_{n,2j}$ are separated by $r_n$ observations, a
well-known inequality for the total variation distance [cf. Eberlein (1984)]
between the joint distributions of dependent observations and independent
copies yields

(5.4)    $\big\| P^{(Y_{n,2j})_{1\le j\le\lfloor m_n/2\rfloor}} - P^{(Y^*_{n,2j})_{1\le j\le\lfloor m_n/2\rfloor}} \big\|_{TV} \le \lfloor m_n/2 \rfloor\, \beta_{n,r_n} \to 0$

by (B2). Combining (5.3) with (5.4), we arrive at

$\frac{1}{\sqrt{n v_n}} \sum_{j=1}^{\lfloor m_n/2 \rfloor} \big(\Delta_{n,2j}(f) - E\Delta_{n,2j}(f)\big) = o_P(1)$.

Together with the analogous convergence for the sum over the odd numbered
blocks, this proves (5.2).

Thus, the fidis of $Z_n$ converge if and only if the fidis of

$\bar Z_n(f) := Z_n(f) - \frac{1}{\sqrt{n v_n}} \sum_{j=1}^{m_n} \big(\Delta_{n,j}(f) - E\Delta_{n,j}(f)\big) = \frac{1}{\sqrt{n v_n}} \sum_{j=1}^{m_n} \big( f(Y_{n,j}^{(r_n - l_n)}) - Ef(Y_{n,j}^{(r_n - l_n)}) \big)$

converge, and in this case the limiting distributions are the same. Similarly,
by (5.1), the corresponding assertion holds for the sums over the independent
blocks, and then the lemma follows from the inequality for the total variation
distance, since it implies that

$\big\| P^{(Y_{n,j}^{(r_n-l_n)})_{1\le j\le m_n}} - P^{((Y^*_{n,j})^{(r_n-l_n)})_{1\le j\le m_n}} \big\|_{TV} \le m_n \beta_{n,l_n} \to 0$

by (B2), since the shortened blocks $Y_{n,j}^{(r_n-l_n)}$ are separated by $l_n$
observations. □

Proof of Theorem 2.3. The assertion follows from Lemma 5.1
and the multivariate central limit theorem for triangular arrays of row-wise
independent random vectors applied to $(Z_n^*(f_1),\ldots,Z_n^*(f_k))$. □

Next we present a useful technical lemma. It makes it possible to replace 
some of the assumptions of Theorem 2.3 by sufficient conditions which are 
more restrictive but often simpler to verify. 

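Lemma 5.2. (i) If $Var(\Delta_n(f)) = o(r_n v_n)$, then (C1) holds.

(ii) If $n v_n \to \infty$ and $\|f\|_\infty = \sup_{x\in E_\cup} |f(x)| < \infty$, then (C1) and (C2)
hold.

(iii) If $r_n v_n \to 0$ and

(5.5)    $\frac{1}{r_n v_n} E\big(f(Y_n) g(Y_n)\big) \to c(f,g) \quad \forall f, g \in \mathcal{F}$,

then (C3) holds.

(iv) If

(5.6)    $E\big(f(Y_n)^2 1_{\{|f(Y_n)| > \varepsilon\sqrt{n v_n}\}}\big) = o(r_n v_n) \quad \forall \varepsilon > 0,\ f \in \mathcal{F}$,

then (C2) holds.

(v) If $n v_n \to \infty$ and $(f(Y_n)^2)_{n\in\mathbb{N}}$ is uniformly integrable under $P(\cdot)/(r_n v_n)$
for all $f \in \mathcal{F}$, then (C2) holds.

(vi) If $E(|f(Y_n)|^{2+\delta}) = O(r_n v_n)$ for some $\delta > 0$ and all $f \in \mathcal{F}$, then
$(f(Y_n)^2)_{n\in\mathbb{N}}$ is uniformly integrable under $P(\cdot)/(r_n v_n)$ for all $f \in \mathcal{F}$.

(vii) If (B1) and (B3) hold, then $\lim_{k\to\infty} \limsup_{n\to\infty} \frac{1}{r_n v_n} P\{L(Y_n) > k\} = 0$
and the cluster lengths $(L(Y_n))_{n\in\mathbb{N}}$ are tight under $P(\cdot \mid Y_n \ne 0)$.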
Proof. (i) The first equation in (C1) follows at once, and the second
one by using Chebyshev's inequality.

(ii) Under these conditions, (C2) obviously holds. Moreover, (C1) follows
by (i), since $|\Delta_n(f)| \le 2\|f\|_\infty 1_{\{\Delta_n(f)\ne 0\}}$ implies

$Var(\Delta_n(f)) \le E\Delta_n^2(f)$
$\quad \le 4\|f\|_\infty^2 P\{\Delta_n(f) \ne 0\}$
$\quad = O\big(P\{X_{n,i} \ne 0 \text{ for some } r_n - l_n < i \le r_n\}\big)$
$\quad = O(l_n v_n)$
$\quad = o(r_n v_n)$.

(iii) By (5.5), $P\{Y_n \ne 0\} \le r_n v_n \to 0$ and the Cauchy-Schwarz inequality,
we have that

(5.7)    $\frac{1}{\sqrt{r_n v_n}} |Ef(Y_n)| \le \Big( \frac{1}{r_n v_n} E\big(f(Y_n)^2\big) P\{Y_n \ne 0\} \Big)^{1/2} \to 0$

for $f \in \mathcal{F}$. (C3) then follows readily from (5.5).

(iv) By (5.6), for any e > 0, 

Hence, Ef{Yn) = o{^nVn), and (C2) then follows from (5.6) by standard 
reasoning. 

(v) By uniform integrability, n/r.„ — )■ oo and Chebyshev's inequality, 

(- n/rn 

Using uniform integrability again, it follows that £^(/(l^)^l{|^(y^)|>^^/^^})/ 
{fnVn) 0, so that (5.6) is satisfied. The result then follows from part (iv). 

(vi) This is a well-known fact. 

(vii) Let $M_n^{s,t} := \sum_{i=s+1}^{t} 1_{\{X_{n,i}\ne 0\}}$ be the number of nonvanishing
observations in the time interval from $s + 1$ to t and write
$F_{n,i} = \{X_{n,1} = \cdots = X_{n,i-1} = 0,\ X_{n,i} \ne 0\}$, $i \ge 2$, and $F_{n,1} = \{X_{n,1} \ne 0\}$ for the
events that the first nonzero value in row n occurs at position i. Then
P{L{Yn) >k}=Y^ P{L{Yn) > k I Fn,i)P{Fn,i) 
i=l 



Y,PiM:-_,,y^O\F,,)P{F^ 



i=l 



i=l 



The result then follows from (B3) and r„f^ 



Proof of Corollary 2.4. The first assertion follows if we prove
that (C3') implies (C3). However, using that
$|E(f(Y_n) g(Y_n) 1_{\{L(Y_n)>k\}})| \le \big( E(f(Y_n)^2 1_{\{L(Y_n)>k\}})\, E(g(Y_n)^2 1_{\{L(Y_n)>k\}}) \big)^{1/2}$,
it follows from (2.1) and (2.2) that

$\frac{1}{r_n v_n} E\big(f(Y_n) g(Y_n)\big) = \frac{1}{r_n v_n} E\big(f(Y_n) g(Y_n) 1_{\{L(Y_n)\le k\}}\big) + R_{n,k}, \quad R_{n,k} := \frac{1}{r_n v_n} E\big(f(Y_n) g(Y_n) 1_{\{L(Y_n)>k\}}\big)$,

with $\lim_{k\to\infty} \limsup_{n\to\infty} |R_{n,k}| = 0$. A standard subsequence argument then
shows that $c(f,g) := \lim_{k\to\infty} c_k(f,g)$ exists, and that

$\lim_{n\to\infty} \frac{1}{r_n v_n} E\big(f(Y_n) g(Y_n)\big) = c(f,g)$.

By Lemma 5.2(iii), it then follows that (C3) holds.

Now suppose instead that (B1), (B2), (B3), (C1) and (C3'') hold.
Assumption (C2) then follows from Lemma 5.2(v), and, hence, only (C3) remains to
be established. By Lemma 2.5(ii) and (iii), $\theta_n = P\{Y_n \ne 0\}/(r_n v_n) \to \theta > 0$
and $P^{(fg)(Y_n) \mid Y_n \ne 0}$ converges weakly to $\mu_{fg,W}$. Thus, the uniform
integrability of $(fg)(Y_n)$ under $P(\cdot)/(r_n v_n)$ is equivalent to the uniform integrability
under $P(\cdot \mid Y_n \ne 0)$ so that

$\frac{1}{r_n v_n} E\big(f(Y_n) g(Y_n)\big) = \frac{P(Y_n \ne 0)}{r_n v_n} E\big(f(Y_n) g(Y_n) \mid Y_n \ne 0\big) \to \theta \int x\, \mu_{fg,W}(dx) = E\big((fg)(W) - (fg)(W^{(2;\infty)})\big)$.

It then follows from Lemma 5.2(iii) that (C3) holds with $c(f,g)$ given by
(2.4). □

Proof of Lemma 2.5. Again let M* := Yll=s+i ^{x„ ^7^0} denote the 
number of nonvanishing observations in the time interval from s + 1 to t. 
Then 

(5.8) lim sup P(M;;) / I / 0) < lim sup(/3„,i + r^Vn) ^ 

as I — )■ 00, by (B3) and r„v„ — )• 0. Hence, the analog to condition (2) of Segers 
(2003) holds and one may conclude the assertions (i) and (ii) by essentially 
the same arguments as given for the proofs of Theorem 1 (with t„ = r„). 
Corollary 2 and Theorem 3(i) there. 

The proof of (iii) also follows the ideas used in the proof of Theorem 3(ii)
in that paper. Nevertheless, we give more details, since we want to avoid
working with the space of sequences with almost all terms equal to 0
that was introduced by Segers (2003). Moreover, in this proof we replace
assumption (2.3) in condition (C3.1'') by the weaker assumptions (2.7) and
(2.8).

We first consider a bounded cluster functional g such that $D_{g,m,I} \subset D_{f,m,I}$
for all $m \in \mathbb{N}$ and $I \subset \{1,\ldots,m\}$. The result for f itself will then follow easily.

Let A; S N be arbitrary and, as before, let || ||tv denote the total variation 

distance between two measures. By (5.8), for all e > there exists / > k such 

that for sufficiently large n and xit^ = (^n,j)i<i<A; 

l|p(4'=) G -, m;;;, = 1 x„,i / 0) - p(x('=) G ., M^,, = 1 x„,i ^ o)||tv 

(5.9) < P(M^;, / I Xn,i / 0) 
and, by (2.6), 

g = 0} g •,Ty('=+i'0 = o}||tv 

(5.10) < / for some i > 

Recall the definition of the sets Nkj for / C {1,...,A;} from Remark 2.6. 
Since, according to assumption (C3.1"), the substochastic measures 
•, X^^ G Nkj, fc = I 7^ 0) converge weakly to the substochastic mea- 
sure P{W^'''> G ■,W^''^ G Nkj,W^''+^'^^ = 0}, it follows from (5.9) and (5.10) 
that, for all A; G N, and all subsets I C {1, . . . ,k}, 

P{xi^^ G -,4'^ e A^fc,/, = I Xn,i + 0) 

(5.11) 

^ P{VF('=) G -.W^^^ G A^fc,/, = 0} 

weakly. 

By assertion (i), we have 

(5.12) E{g{Y,,) I K„ / 0) = ^E{g{xt-~^) - g{Xt^-y) \ ^ 0) + o(l). 

Again by (5.9) and the definition of a cluster functional, 

|P(5(X(^'"))-ff(Xf'-'^))|X„,i/0) 

(5.13) 

- i?((5(X«) - 9(X(2;')))l^,,.„_o^ I / 0)1 < 2e||5||oo. 

In view of (5.11) (with k = l), for all / C {1, . . . ,1}, the continuous mapping 
theorem yields 

^(9{X^^)l^^(i)^^^ ^yl{M^n=0} I ^n,l 7^ 0) ^(9(W^'^)l{VF{0eA',,/}l{ty('+ii°°) = 
because the function $g 1_{N_{l,I}}$ is bounded and continuous on the complement
of the set $D_{f,l,I}$, which by (2.7) is a null set under the limit measure in
(5.11). Sum up these equations for all $I \subset \{1,\ldots,l\}$ and combine this with
an analogous result for $g(X_n^{(2;l)})$ to obtain

E((5(x«) - <7(x(2;0))i^^._^^ I /o) 

(5.14) 

{iy(i+i;oo)=0}J- 

Combining (5.10), (5.12)-(5.14) and 0„ — t- > 0, one arrives at 

(5.15) E{g{Y^) I / 0) ^ ^EigiW) - giW^^''^'^)). 

Now, if f is an arbitrary cluster functional satisfying the conditions of the
proposition and $h: \mathbb{R} \to \mathbb{R}$ is continuous and bounded, then an application
of (5.15) with $g = h \circ f$ yields assertion (iii). □

Proof of Corollary 2.7. This is immediate from Corollary 2.4 and 
Lemma 5.2(ii). □ 

Proof of Theorem 2.8. The processes $Z_n$ are asymptotically tight
if the analogous sums over the even numbered and over the odd numbered
blocks

$$\frac{1}{\sqrt{n v_n}} \sum_{j=1}^{\lfloor m_n/2 \rfloor} \bigl(f(Y_{n,2j}) - E f(Y_{n,2j})\bigr) \quad\text{and}\quad \frac{1}{\sqrt{n v_n}} \sum_{j=1}^{\lceil m_n/2 \rceil} \bigl(f(Y_{n,2j-1}) - E f(Y_{n,2j-1})\bigr) \tag{5.16}$$

are asymptotically tight. In view of (5.4), the first expression is asymptotically
tight if and only if the analogous expression with independent blocks,
that is,

$$\frac{1}{\sqrt{n v_n}} \sum_{j=1}^{\lfloor m_n/2 \rfloor} \bigl(f(Y^*_{n,2j}) - E f(Y^*_{n,2j})\bigr), \tag{5.17}$$

is asymptotically tight, which follows from Theorem 2.11.9 of van der Vaart
and Wellner (1996) applied with $Z_{ni}(f) = (n v_n)^{-1/2} f(Y^*_{n,2i})$ (and $m_n$ replaced with
$\lfloor m_n/2 \rfloor$). Observe that for a sequence of monotonically increasing positive
functions $T_n(\delta)$ the convergence of $T_n(\delta_n)$ to 0 for all sequences $\delta_n \downarrow 0$ is
equivalent to $\lim_{\delta \downarrow 0} \limsup_{n \to \infty} T_n(\delta) = 0$, so that the last two displayed
conditions in Theorem 2.11.9 of van der Vaart and Wellner (1996) can be
reformulated as (D3) and (D4), respectively. The proof of tightness of the
sum over the blocks with odd numbers is the same. □
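
The equivalence for monotone $T_n$ invoked in this proof can be checked directly. The following is a sketch, assuming only that each $T_n$ is nonnegative and nondecreasing in $\delta$; the auxiliary symbols $\eta$, $\delta_0$, $c$, $c'$ and $n_k$ are introduced here for illustration and are not part of the original argument.

```latex
% Claim: T_n(\delta_n) \to 0 for every sequence \delta_n \downarrow 0
%        if and only if
%        \lim_{\delta \downarrow 0} \limsup_{n \to \infty} T_n(\delta) = 0.

% "If" direction. Fix \eta > 0 and choose \delta_0 > 0 such that
\[
  \limsup_{n \to \infty} T_n(\delta_0) < \eta .
\]
% For all large n we have \delta_n \le \delta_0, so by monotonicity
\[
  T_n(\delta_n) \le T_n(\delta_0) < \eta \qquad \text{for all sufficiently large } n .
\]

% "Only if" direction, by contraposition. The map
% \delta \mapsto \limsup_n T_n(\delta) is nondecreasing, so its limit
% c \in [0, \infty] as \delta \downarrow 0 exists. Suppose c > 0 and put
% c' := \min(c, 1). Then \limsup_n T_n(1/k) \ge c' for every k, so there
% are indices n_1 < n_2 < \cdots with
\[
  T_{n_k}(1/k) > c'/2 \qquad (k = 1, 2, \ldots).
\]
% Define \delta_n := 1 for n < n_1 and \delta_n := 1/k for n_k \le n < n_{k+1}.
% Then \delta_n \downarrow 0, but
\[
  T_{n_k}(\delta_{n_k}) = T_{n_k}(1/k) > c'/2 \quad \text{for all } k,
\]
% so T_n(\delta_n) does not converge to 0.
```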

Proof of Remark 2.9(ii). By the Cauchy-Schwarz inequality, 

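$$\begin{aligned}
E^*\bigl(F(Y_n) 1_{\{F(Y_n) > \varepsilon\sqrt{n v_n}\}}\bigr)
&\le \Bigl(E^*\bigl(F^2(Y_n) 1_{\{F(Y_n) > \varepsilon\sqrt{n v_n}\}}\bigr) \cdot E^* 1_{\{F(Y_n) > \varepsilon\sqrt{n v_n}\}}\Bigr)^{1/2} \\
&\le \Bigl(o(r_n v_n) \cdot \frac{E^* F^2(Y_n)}{\varepsilon^2 n v_n}\Bigr)^{1/2} \\
&= o\Bigl(\Bigl(\frac{(r_n v_n)^2}{n v_n}\Bigr)^{1/2}\Bigr) \\
&= o\bigl(r_n \sqrt{v_n/n}\bigr),
\end{aligned}$$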

so (D2) holds. Further, (D2') implies (5.6), and, hence, (C2) follows from 
Lemma 5.2(iv). 

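Next, suppose $E^* F^{2+\delta}(Y_n) = O(r_n v_n)$ and $n v_n \to \infty$. Then

$$\begin{aligned}
E^*\bigl(F^2(Y_n) 1_{\{F(Y_n) > \varepsilon\sqrt{n v_n}\}}\bigr)
&\le \bigl(E^* F^{2+\delta}(Y_n)\bigr)^{2/(2+\delta)} \cdot \bigl(E^* 1_{\{F(Y_n) > \varepsilon\sqrt{n v_n}\}}\bigr)^{1-2/(2+\delta)} \\
&\le \bigl(E^* F^{2+\delta}(Y_n)\bigr)^{2/(2+\delta)} \cdot \Bigl(\frac{E^* F^{2+\delta}(Y_n)}{(\varepsilon\sqrt{n v_n})^{2+\delta}}\Bigr)^{1-2/(2+\delta)} \\
&= O\bigl(r_n v_n (n v_n)^{-\delta/2}\bigr) \\
&= o(r_n v_n),
\end{aligned}$$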

so that (D2') holds. □ 
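
The exponent bookkeeping in the last bound can be spelled out as follows. This is a sketch under the stated assumption $E^* F^{2+\delta}(Y_n) = O(r_n v_n)$, using Markov's inequality for the indicator; the shorthand $a_n := r_n v_n$ and $b_n := n v_n$ is introduced here only for readability.

```latex
% Shorthand (illustrative only): a_n := r_n v_n, b_n := n v_n \to \infty.

% Markov's inequality with the (2+\delta)-th moment:
\[
  E^* 1_{\{F(Y_n) > \varepsilon \sqrt{b_n}\}}
  \le \frac{E^* F^{2+\delta}(Y_n)}{\varepsilon^{2+\delta}\, b_n^{(2+\delta)/2}}
  = O\bigl(a_n\, b_n^{-(2+\delta)/2}\bigr).
\]

% Hoelder's inequality with exponents (2+\delta)/2 and (2+\delta)/\delta,
% noting that 1 - 2/(2+\delta) = \delta/(2+\delta):
\[
  E^*\bigl(F^2(Y_n) 1_{\{F(Y_n) > \varepsilon \sqrt{b_n}\}}\bigr)
  = O\Bigl(a_n^{2/(2+\delta)} \cdot
      \bigl(a_n\, b_n^{-(2+\delta)/2}\bigr)^{\delta/(2+\delta)}\Bigr).
\]

% Collecting powers:
\[
  a_n^{2/(2+\delta)} \cdot a_n^{\delta/(2+\delta)} \cdot
  b_n^{-\frac{2+\delta}{2} \cdot \frac{\delta}{2+\delta}}
  = a_n\, b_n^{-\delta/2}
  = r_n v_n\, (n v_n)^{-\delta/2}
  = o(r_n v_n).
\]
```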

Proof of Theorem 2.10. First assume (D6) holds. Using the triangle
inequality, it is easily seen that $Z_n$ is asymptotically equicontinuous if
both terms given in (5.16) are asymptotically equicontinuous. Further, by
(5.4), the first term is asymptotically equicontinuous if and only if (5.17) is
asymptotically equicontinuous. However, asymptotic equicontinuity of (5.17)
follows from Theorem 2.11.1 of van der Vaart and Wellner (1996). To see
this, note that (D6) implies the analogous random entropy condition for the
sums over the even numbered blocks, because the corresponding random
semi-metric is smaller for these sums.

If $m_n$ is even, then the second term in (5.16) has the same distribution
as the first one, while for $m_n$ odd, with probability greater than or equal to
$1 - r_n v_n \to 1$, the additional summand $(n v_n)^{-1/2}(f(Y_{n,m_n}) - E f(Y_{n,m_n}))$
equals $-(n v_n)^{-1/2} E f(Y_{n,m_n})$, which tends to 0 uniformly for $f \in \mathcal{F}$ [cf.
(5.7)]. This proves the first assertion of the theorem. Theorem 2.3 then
yields the convergence of $Z_n$, because the Lindeberg condition (C2) follows
from (D2) [see Remark 2.9(ii)].

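Next, to see that (D6') implies (D6), check that the random semi-metric
$d_n$ can be represented as $d_n = (m_n/(n v_n))^{1/2} \cdot d_Q$ with the (random) probability
measure $Q = m_n^{-1} \sum_{j=1}^{m_n} \delta_{Y^*_{n,j}}$ (where $\delta_y$ denotes the point mass at $y$), and, hence,
$N(\varepsilon, \mathcal{F}, d_n) = N(\varepsilon (n v_n/m_n)^{1/2}, \mathcal{F}, d_Q)$. If $\int F^2\,dQ = 0$, then $d_n(f,g) = 0$ for all $f, g \in \mathcal{F}$
and the integral in (D6) vanishes. Otherwise, for all $\eta > 0$ there exists a
$\tau > 0$ such that, for sufficiently large $n$,

$$P\Bigl\{\Bigl(\int F^2\,dQ\Bigr)^{1/2} > \tau (n v_n/m_n)^{1/2}\Bigr\} \le \frac{E F^2(Y_n)}{\tau^2 n v_n/m_n} \le \eta,$$

since $E F^2(Y_n) = O(r_n v_n)$, and thus with probability larger than $1 - \eta$,

$$\begin{aligned}
\int_0^\delta \sqrt{\log N(\varepsilon, \mathcal{F}, d_n)}\,d\varepsilon
&= \tau \int_0^{\delta/\tau} \sqrt{\log N(\varepsilon\tau, \mathcal{F}, d_n)}\,d\varepsilon \\
&\le \tau \int_0^{\delta/\tau} \sup_Q \sqrt{\log N\Bigl(\varepsilon \Bigl(\int F^2\,dQ\Bigr)^{1/2}, \mathcal{F}, d_Q\Bigr)}\,d\varepsilon \to 0
\end{aligned}$$

as $\delta \downarrow 0$, under (D6'). □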
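
The rescaling between $d_n$ and $d_Q$ used in this step can be written out explicitly. This is a sketch which assumes that $d_n$ is the empirical semi-metric $d_n^2(f,g) = (n v_n)^{-1} \sum_{j=1}^{m_n} (f(Y^*_{n,j}) - g(Y^*_{n,j}))^2$ from condition (D6) and that $d_Q$ is the $L_2(Q)$ semi-metric; both definitions are taken as assumptions here, since they are stated outside this section.

```latex
% Assumed definitions (from outside this section):
%   d_n^2(f,g) = (n v_n)^{-1} \sum_{j=1}^{m_n} (f(Y^*_{n,j}) - g(Y^*_{n,j}))^2,
%   d_Q^2(f,g) = \int (f-g)^2 \, dQ,
%   Q          = m_n^{-1} \sum_{j=1}^{m_n} (point mass at Y^*_{n,j}).

\[
  d_n^2(f,g)
  = \frac{m_n}{n v_n} \cdot \frac{1}{m_n} \sum_{j=1}^{m_n}
      \bigl(f(Y^*_{n,j}) - g(Y^*_{n,j})\bigr)^2
  = \frac{m_n}{n v_n}\, d_Q^2(f,g).
\]

% A d_n-ball of radius \varepsilon is therefore a d_Q-ball of radius
% \varepsilon (n v_n / m_n)^{1/2}, which gives
\[
  N(\varepsilon, \mathcal{F}, d_n)
  = N\bigl(\varepsilon (n v_n/m_n)^{1/2}, \mathcal{F}, d_Q\bigr).
\]

% On the event (\int F^2 dQ)^{1/2} \le \tau (n v_n/m_n)^{1/2}, and because
% covering numbers are nonincreasing in the radius,
\[
  N(\varepsilon\tau, \mathcal{F}, d_n)
  = N\bigl(\varepsilon\tau (n v_n/m_n)^{1/2}, \mathcal{F}, d_Q\bigr)
  \le N\Bigl(\varepsilon \Bigl(\int F^2 \, dQ\Bigr)^{1/2}, \mathcal{F}, d_Q\Bigr).
\]
```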



Proof of Corollary 3.6. Condition (D1) is satisfied since $F(x_1,\ldots,x_k) \le \sum_{i=1}^k \phi_{\max}(x_i)$
and since $\phi_{\max}$ is assumed to be measurable and bounded.
Similarly, condition (D2') follows from $F(Y_n) \le r_n \|\phi_{\max}\|_\infty$, since $r_n = o(\sqrt{n v_n})$
by assumption.

By Lemma 5.2(i), assumption (CI) follows if we show that Var(A„(/)) = 

0{rnVn)- Now, 

by the row-wise stationarity, and, consequently, by (3.5) and $l_n = o(r_n)$,

/J \ 2 



< 



.1=1 
= o( — rnVnj =o{rnVn)



Further, (3.6) follows from 



i=r„m„+l 



\ 2 



_ 4||(/)max||^ „ 
— ' n'-'n ^ 

nVn 

Therefore, the remaining assertions follow from Theorems 2.8 and 2.10 
and Remark 2.9(i) and (ii). □ 



Proof of Remark 3.7(i). Since 

2 

E{g(t,{Yn) l{L(y„)>fc}) 



/ 2/(2+5) e/^^^,) 

the first part (2.1) of (C3') follows from (3.7) and (3.9), since $\phi$ is assumed
to be bounded. Next,

E{g^{Yn)g^{Yn)l[L{Yr,)<k}) 



(5-18) =^ Yl E{^{Xn,^mXn,J)l{L{Y„)<k}) 

" " i,j£{l,-,rn},\i-j\<k-l 

= —E{cl){Xn,imXn,l)) 
Vn 

fe-1 . 
1=1 
with



'n,k I 



^ E{(l){Xn,i)ll^{Xn,j)l{L(Y.,,)>k}) 
j,je{l,...,r„},|i-i|<fc-l 



< 



oo II V'lloo 



{X^,il^O} -L{L(y„)>fe} 



It then follows as above that $\lim_{k \to \infty} \limsup_{n \to \infty} |R_{n,k}| = 0$, and, hence, the
assumption (2.2) of (C3') can be seen to be satisfied, with $c$ given by (3.10).

□ 



Proof of Corollary 3.9. Clearly, (3.15) implies (2.10) and hence 
also (D2'). Moreover, (3.15) implies that 



2+5 



E '^max(^n,i) < ^ <^max(^n,0 + <Y. '^max(^n,*) < 1 



-.1=1 



i=l 



■ 0{rnVn) 



Hence, similar arguments as used in the proof of Corollary 3.6 show that 
{Zn{g(f)))(j)e^ converges weakly to a Gaussian process. Finally, (3.6) and thus 
the convergence of {Zn{(p)) (pe^ follows from 

2 



E* sup ■ 



1 



.4>e^ V " 



J2 {HXn,i) - Ect>{Xn,i)) 
j=r„m„ + l 



<e( X](0max(^n,*) + ^0max(^n,i)) ) 



^ ^ £'Amax(X„,i) 



Oirn/n)^0. 



□ 



Proof of Corollary 4.3. (i) The index set $\mathcal{C} := \{C_{t_1,\ldots,t_k} \mid t_1,\ldots,t_k \in [0,1]\}$
equipped with the metric $\rho_{\mathcal{C}}(C_{s_1,\ldots,s_k}, C_{t_1,\ldots,t_k}) := \max_{1 \le i \le k} |s_i - t_i|$
is totally bounded. The same holds for $\mathcal{D} := \{D_{t_1,\ldots,t_k} \mid t_1,\ldots,t_k \in [0,1]\}$.

In view of the discussion preceding Corollary 4.3, the assertions follow
from Theorem 2.10 combined with Corollary 2.7 if we verify condition (D5)
and that the index sets $\mathcal{C}$ and $\mathcal{D}$ are VC-classes. Condition (D5) is satisfied
since all processes under consideration are separable.
That $\mathcal{C}$ is a VC-class may be established by observing that
$C_{t_1,\ldots,t_k} = \psi^{-1}\bigl(\times_{i=1}^k (t_i,\infty)\bigr)$ with

$$\psi(x_1,\ldots,x_m) := \begin{cases} (x_j,\ldots,x_{j+k-1}), & \text{if } j = \min\{i \mid x_i \ne 0\} \le m-k+1, \\ (0,\ldots,0), & \text{else.} \end{cases}$$

Since $\{\times_{i=1}^k (t_i,\infty) \mid t_1,\ldots,t_k \ge 0\}$ is known to be a VC-class [cf. van der
Vaart and Wellner (1996), Example 2.6.1], $\mathcal{C}$ is a VC-class, too [van der
Vaart and Wellner (1996), Lemma 2.6.17(v)].

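The sets $\mathcal{D}_j := \{E_{j,t} \mid t \ge 0\}$ are linearly ordered (i.e., $E_{j,s} \subset E_{j,t}$ if $s > t$)
and, hence, they are VC-classes, and hence so is

$$\mathcal{D} = \mathcal{D}_1 \sqcap \mathcal{D}_2 \sqcap \cdots \sqcap \mathcal{D}_k = \Bigl\{\bigcap_{i=1}^k D_i \Bigm| D_i \in \mathcal{D}_i\Bigr\}$$

[van der Vaart and Wellner (1996), Lemma 2.6.17(ii)].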

(ii) By the results of Segers (2003), condition (C3.1") is satisfied in the
weaker version discussed in Remark 2.6, because the limit r.v.'s are continuous
on $(0, \infty)$ and the discontinuity sets have Lebesgue measure 0. Hence,
the assertions follow by part (i), if the asymptotic equicontinuity condition
(D3) can be shown.

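For this, first note that

$$C_{s_1,\ldots,s_k} \triangle C_{t_1,\ldots,t_k} \subset \bigl\{(x_1,\ldots,x_m) \in E_\cup \bigm| m \in \mathbb{N},\ \exists\, 0 \le j \le m-k,\ 1 \le l \le k:\ x_i = 0\ \forall\, 1 \le i \le j,\ x_{j+l} \in (\min(s_l,t_l), \max(s_l,t_l)]\bigr\}.$$

Thus, Lemma 2.5(i) and (ii) yield that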

-^P{YneCs,,...,s,ACt,,...,tJ 

< e Aa„„„i, I x„,i ^ 0) • P{y„ ^ 0} 



+ 



TnVn 

= P(X('-") G C,,,...,,, Aa,,...,i, I X„,i / 0) + 0(1) 

k 

<^P{Xn^l G (min(si,t;),max(si,i;)] I Xn,i^^) + o{l) 



1=1 
k 



<Y,P{X^^l G (min(s;,tO,max(s;,tO] I X^^i^Q) ■ ^fei^ + o(l) 

k 

= ^|i«-5i|+o(l), 

1=1 
where the term $o(1)$ tends to 0 uniformly for all $s_1,\ldots,s_k,t_1,\ldots,t_k \in [0,1]$.
Now, (D3) follows immediately from the definition of $\rho_{\mathcal{C}}$.

To verify condition (D3) for the indicator functions describing the largest 
order statistics in a cluster, note that 

k k 

i=i i=i 

C < (xi, . . .,Xm) e Eu\meN, 



X] l{mm{sj,t,),l] i^i) ^ J' ^ l(max{^*j,t,),l] (Xi) < j for SOme l<j<k 
i=l i=l > 

C {{xi, . . .,Xm) £ Eu\m£N, 

Xi G {mm{sj,tj),max{sj,tj)] for some l<j<A;,l<i< m}. 
This implies 



-^plYnenE,,s,Ar]EjA 



k 

< ^'^P{Xn,i G {mm{sj,tj),max{sj,tj)] \ Xn,i / 0) 
i=i 

from which (D3) follows. □ 

Proof of the result in Example 4.4. The convergence of the fidis 
of Zn to those of a Gaussian process with covariance function (4.6) follows 
from Corollary 2.7 by the same arguments as in the proof of Corollary 4.3(ii). 

In view of the discussion before Corollary 4.3, the proof will be completed
by showing that conditions (D3), (D5) and (D6) of the asymptotic equicontinuity
Theorem 2.10 also are satisfied. The measurability condition (D5)
holds since, for fixed $k$, the processes $(1_{C_{k,t_1,\ldots,t_k}})_{(t_1,\ldots,t_k) \in [0,1]^k}$ are separable
and a supremum of countably many suprema of separable processes is
measurable.

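We will use (4.2) to verify that (D3) is satisfied for the semi-metric

$$\rho(1_{C_{j,s_1,\ldots,s_j}}, 1_{C_{k,t_1,\ldots,t_k}}) := \begin{cases} P\{L(W) \in \{j,k\}\}, & \text{if } j \ne k, \\ P\{L(W) = k,\ W_i \in (s_i \wedge t_i, s_i \vee t_i] \text{ for some } 1 \le i \le k\}, & \text{if } j = k. \end{cases}$$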
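Now, $\mathcal{F} = \{1_{C_{k,t_1,\ldots,t_k}} \mid k \ge 1,\ t_1, t_2, \ldots \in [0,1]\}$ is totally bounded with respect
to $\rho$. To see this, for $\varepsilon > 0$ given, choose $0 = a_{i,0} < a_{i,1} < \cdots < a_{i,m_i} = 1$
such that $P\{W_i \in (a_{i,j-1}, a_{i,j}]\} < \varepsilon/k_\varepsilon$ for $1 \le i \le k_\varepsilon$ and $1 \le j \le m_i$, with
$k_\varepsilon$ chosen large enough to make $P\{L(W) > k_\varepsilon\} < \varepsilon/2$. Then

$$\{1_{C_{k,t_1,\ldots,t_k}} \mid k > k_\varepsilon\}, \qquad \{1_{C_{j,t_1,\ldots,t_j}} \mid t_i \in [a_{i,l_i-1}, a_{i,l_i}],\ \forall\, 1 \le i \le j\},$$

for $1 \le j \le k_\varepsilon$, $1 \le l_i \le m_i$, is a finite cover of $\mathcal{F}$ with diameter at most $\varepsilon$.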
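By Lemma 2.5,

$$P(L(Y_n) = k \mid Y_n \ne 0) \to \frac{1}{\theta}\bigl(P\{L(W) = k\} - P\{L(W^{(2;\infty)}) = k\}\bigr) \tag{5.19}$$

and, by Scheffé's lemma, the convergence is uniform in $k \in \mathbb{N}$. (Note that, for
$k \le l$, the cluster functional $1_{\{k\}} \circ L$ is constant on all sets $N_{l,I}$ defined in
Remark 2.6.) Similarly,

$$\begin{aligned}
&P\bigl(L(Y_n) = k,\ (Y_n^c)_1 \le t_1, \ldots, (Y_n^c)_k \le t_k \mid Y_n \ne 0\bigr) \\
&\quad \to \frac{1}{\theta}\Bigl(P\{L(W) = k,\ W_1 \le t_1, \ldots, W_k \le t_k\} - P\bigl\{L(W^{(2;\infty)}) = k,\ ((W^{(2;\infty)})^c)_i \le t_i,\ \forall\, 1 \le i \le k\bigr\}\Bigr),
\end{aligned} \tag{5.20}$$

and the convergence is uniform in $t_1,\ldots,t_k \in [0,1]$ for each fixed $k$, because the
right-hand side defines a continuous function.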

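For $\varepsilon > 0$ let $\delta = \varepsilon/2$ and consider $j, s_1,\ldots,s_j, k, t_1,\ldots,t_k$ such that
$\rho(1_{C_{j,s_1,\ldots,s_j}}, 1_{C_{k,t_1,\ldots,t_k}}) < \delta$. Then for $j \ne k$ and $n$ large,

$$\frac{1}{r_n v_n} P\{Y_n \in C_{j,s_1,\ldots,s_j} \triangle C_{k,t_1,\ldots,t_k}\} \le \frac{1}{r_n v_n} P\{L(Y_n) \in \{j,k\}\} = \theta_n P(L(Y_n) \in \{j,k\} \mid Y_n \ne 0) < \varepsilon$$

by (5.19), Lemma 2.5(ii) and the definition of $\rho$.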
If instead $j = k \le k_\varepsilon$, then using (5.20), for large $n$,

——P{Yn G Cj^si,...,Sj^Ckfy,...,tk} 

= enP{L{Yn) = k, {Y^)i G {si A ti, Si V U] for some l<i<k\Y^^O) 

- ^" = A;, Wi G {si A ti. Si V ti] for some 1 < i < A:} + < e, 

again by Lemma 2.5 and the definition of $\rho$.
Finally, if $j = k > k_\varepsilon$, then for large $n$

-^P{Yn G Cj,s,,...,s,^Ck,u,...,t,) < P{L{Yn) = A: I y„ / 0) 

< 2P{L{W) > k,) < e. 
This concludes the proof of (4.2), and hence also the proof of (D3).

For the proof of (D6), let $\mathcal{C}_k = \{C_{j,t_1,\ldots,t_j} \mid 1 \le j \le k,\ t_1,\ldots,t_j \in [0,1]\}$ and
$\mathcal{F}_k = \{1_C \mid C \in \mathcal{C}_k\}$ so that $\mathcal{F} = \bigcup_{k \ge 1} \mathcal{F}_k$. Define $\psi_k$ as the function which
maps $x \in E_\cup$ to the vector $(1,\ldots,1)$ in $\mathbb{R}^{2k}$ if $L(x) > k$ or $L(x) = 0$ and
which maps $x$ to the vector

$$(1,\ldots,1,0,1,\ldots,1,x_1^c,\ldots,x_j^c,0,\ldots,0) \in \mathbb{R}^{2k}$$

if $1 \le L(x) =: j \le k$. Here the first row of ones has $j - 1$ entries and the
second row has $k - j$ entries, and, hence, the vector ends with $k - j$ zeros,
so that the first $k$ components encode the length of the cluster core. With
this definition, it follows that

$$C_{j,t_1,\ldots,t_j} = \psi_k^{-1}\Bigl(\mathbb{R}^{j-1} \times (-\infty,0] \times \mathbb{R}^{k-j} \times \times_{i=1}^j (-\infty,t_i] \times \mathbb{R}^{k-j}\Bigr).$$
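
A small worked instance may help to see how this encoding separates cluster lengths. The sketch below takes $k = 3$ and follows the construction as read from the surrounding text (first $k$ coordinates flag the cluster length, the remaining $k$ coordinates hold the cluster core padded with zeros); the choice $k = 3$, $j = 2$ is purely illustrative.

```latex
% Illustrative case k = 3, so \psi_3 maps into R^6.
%
%   L(x) = 0 or L(x) > 3 :  \psi_3(x) = (1, 1, 1, 1, 1, 1)
%   L(x) = 1             :  \psi_3(x) = (0, 1, 1, x^c_1, 0, 0)
%   L(x) = 2             :  \psi_3(x) = (1, 0, 1, x^c_1, x^c_2, 0)
%   L(x) = 3             :  \psi_3(x) = (1, 1, 0, x^c_1, x^c_2, x^c_3)

% For j = 2 the representation of the set reads
\[
  C_{2,t_1,t_2}
  = \psi_3^{-1}\bigl( \mathbb{R} \times (-\infty, 0] \times \mathbb{R}
      \times (-\infty, t_1] \times (-\infty, t_2] \times \mathbb{R} \bigr).
\]

% The second coordinate of \psi_3(x) is \le 0 only when L(x) = 2; in every
% other row of the table above it equals 1. Hence the preimage consists
% exactly of those x with
\[
  L(x) = 2, \qquad x^c_1 \le t_1, \qquad x^c_2 \le t_2 ,
\]
% and the target set is a left orthant in R^6 once each factor R is read
% as the orthant side (-\infty, \infty].
```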

The left orthants $\times_{i=1}^{2k} (-\infty, x_i]$ form a VC-class with index bounded by
$2k + 1$ [van der Vaart and Wellner (1996), Example 2.6.1] and, hence, also $\mathcal{C}_k$
is a VC-class with index bounded by $2k + 1$ [Dudley (1999), Theorem 4.2.3].
By van der Vaart and Wellner [(1996), Theorem 2.6.7] for all sufficiently
small $\varepsilon$ and all $k \in \mathbb{N}$, $\mathcal{F}_k$ satisfies the metric entropy bound

1 /2 

nU I F'cIq] <C(2A; + l)(16e)2'=+ie-(4fc+i) 

(5.21) ^ ^ ^ 

with C denoting a universal constant that does not depend on k or e. 

Let $L_{n,1} \ge L_{n,2} \ge \cdots \ge L_{n,m_n}$ be the order statistics in descending order
of the independent cluster lengths $(L(Y^*_{n,j}))_{1 \le j \le m_n}$. Since the empirical $L_2$-semi-metric
$d_n$ satisfies

$$\sup_{i,j > k} d_n^2\bigl(1_{C_{i,s_1,\ldots,s_i}}, 1_{C_{j,t_1,\ldots,t_j}}\bigr) \le \frac{1}{n v_n} \sum_{l=1}^{m_n} 1_{\{L(Y^*_{n,l}) > k\}},$$

it follows that the squared diameter of the set

$$\{C_{j,t_1,\ldots,t_j} \mid j > L_{n,\lfloor \varepsilon^2 n v_n \rfloor},\ t_1,\ldots,t_j \in [0,1]\}$$

w.r.t. $d_n$ is bounded by



1 '"'I I 2 

'"Vn — 



Reasoning as in the last part of the proof of Theorem 2.10, this together 
with (5.21) shows that (D6) follows if we prove that 

(5.22) limlimsupp/ / \/loge~^®^".L^"™nJ+^^ de>T\=0 

SiO n^oo [Jo J 
for all $\tau > 0$. By a change of variables and Hölder's inequality,

Jo 

r*^™"! /■((i+i)/(™„))i/2 

< V ySL" / \/|loge|de 



2 ^''^^ /■(i+i)/(™n) / _ 

< y J Ln,j ■ nvn / Jlogrj-^/'^r] ' drj 



< 



\5nv„] \ 1/(2+20 



X 



' ^ r^™..! . .(,+l)/(n.„) - 



x(2+2C)/(l+2C)\ (1+20/(2+20 

X r/"^/^ dri\ 

Now, 



which is bounded by (4.5). Furthermore, applying Liapunov's inequality to 
the individual summands, 



I r-^™'"! / r{j+i)/{nv,,) I \ (2+20/(1+20 

— {nVn I v/logr/~i/2j^^i/2 (^^^ I 

™" j=l V Jj/{nvn) ) 



^ r-J^nl /■(i+l)/(n^„)/Mog^|\ (1+0/(1+20 

^ A7(ni^„) V ^ / 

as $\delta \to 0$. Hence, we have verified (5.22). This concludes the proof of (D6).
□ 